A64 CPUs, chipsets, motherboards

Mobile A64 754 (update)

There are 1.5 V (81.5W), 1.4V (62 W) and 1.2 V (35 W) Mobile A64.

The 1.2 V 35W is just listed (May 04).

The 1.5 V part is explicitly listed as a desktop replacement (DTR) chip. All three are listed as package B, so they share the same package and can potentially be used in a desktop socket.

The 1.2 V is from NewCastle core, FC0h, w/ 512 KB L2.
The 1.4 and 1.5 V are from ClawHammer, F4Ah, w/ 1 MB L2.

All three are CG revision.

These are the models:

2700+ 1600 MHz L2 512 KB, 1.2V
2800+ 1800 MHz L2 512 KB, 1.2V
2800+ 1600 MHz L2 1 MB, 1.4V
3000+ 1800 MHz L2 1 MB, 1.4V, 1.5V DTR
3200+ 2000 MHz L2 1 MB, 1.4V, 1.5V DTR
3400+ 2200 MHz L2 1 MB, 1.5V DTR
 
New Opteron 150, 250, 850, announced May 18, 2004.
Socket 940
Rated at 2.4 GHz
130 nm SOI

Opteron 150 is considered equivalent to FX-53.

The 250 and 850 are almost the same as the 150. Each has three 16-bit 800 MHz HT links; the difference is that none of the 150's links are coherent, the 250 has one coherent link, and the 850 has three. A coherent link is used to connect to another processor when building multi-processor systems.

http://techreport.com/reviews/2004q2/opteron-x50/index.x?pg=1
 
Reviews on memory modules for A64 platforms

Here are links to information on memory modules for A64 754 and 939 platforms. I will add to and update this post over time.


1. A favorable review of OCZ3700EB memory modules (2x512MB) on a Chaintech VNF3-250 motherboard (Nforce3 250 chipset) w/ an A64 754 3200+. The memory module, rated DDR446, achieved a max stable speed of DDR550 (275 MHz) at CAS3, using 2.8-2.9V (more room to go with higher voltage plus cooling). Separately, it achieved DDR524 on an Intel system. .... On the A64, the tested settings were 200x10 2.5-2-3-10 (4xHT), 214x10 2.5-2-3-10 (4xHT), 238x9 3-2-3-10 (3xHT), 250x8 3-2-3-10 (3xHT), 268x8 3-2-3-10 (3xHT); a BW of 4137 MB/s (int/float) was obtained, and the highest memory speed of 275x7 (lower CPU) was achieved, ....
http://www.anandtech.com/memory/showdoc.html?i=2057&p=1
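To make the HTT x multiplier settings in review 1 easier to read, here is a small Python sketch that turns each pair into a CPU clock and a DDR rating. The function names are made up for illustration, and it assumes the memory runs 1:1 with HTT at every setting, which the review summary above suggests but does not state for each one.

```python
# Settings quoted from the review summary: (HTT clock in MHz, CPU multiplier).
SETTINGS = [(200, 10), (214, 10), (238, 9), (250, 8), (268, 8), (275, 7)]


def ddr_rating(memory_clock_mhz):
    """DDR moves data twice per clock, so a 275 MHz clock is DDR550."""
    return 2 * memory_clock_mhz


def cpu_clock(htt_mhz, multiplier):
    """CPU core clock in MHz for a given HTT clock and CPU multiplier."""
    return htt_mhz * multiplier


for htt, mult in SETTINGS:
    # Assumption: memory clock equals HTT (1:1) at every tested setting.
    print(f"{htt} x {mult}: CPU {cpu_clock(htt, mult)} MHz, memory DDR{ddr_rating(htt)}")
```

Read this way, the reviewer dropped the multiplier as HTT went up, so the CPU stayed near 2.0-2.15 GHz and the memory was the part being pushed.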

Wanting to see how the dual channel 939 would perform running at such high memory speed, ....


2. Anandtech testing of
- Corsair 3200XL Pro
- Crucial Ballistix PC3200
- Kingston HyperX PC3200 Low-Latency
- Mushkin PC3200 Level II V2
- OCZ PC3200 Platinum Rev. 2

The review recommends Crucial Ballistix or OCZ 3500EB/3700EB as first choice for A64, tested to DDR500 speed. The Ballistix and OCZ EB are based on Micron DRAM chips. It also said the OCZ PC3200 Platinum Rev. 2 was the only Samsung memory (TCCD?) to work reliably at DDR500.

http://www.anandtech.com/memory/showdoc.aspx?i=2145&p=1
 
now how come the opterons werent included in this, btw WOW very nice post soooo much info to look at, hard to decifer alot of the best stuff, but ill keep on checking back for the info,,, very nice thread,, overwhelming to me!!
 
unreal said:
now how come the opterons werent included in this, btw WOW very nice post soooo much info to look at, hard to decifer alot of the best stuff, but ill keep on checking back for the info,,, very nice thread,, overwhelming to me!!


The main emphasis is on A64, A64 FX, A64 mobile with the socket variants 754, 939 and 940 that most of us would be using for building computers, ....

Opteron is mentioned in the context of socket 940 and A64 FX.

An Opteron is basically an A64 FX (1 MB L2, dual channel memory controller w/ 128-bit memory bus) with the same internal core (SledgeHammer) with a few extra capabilities such as the coherent HT links used for connecting to other processors via HT bus for building multi-processor system.

I may add more things explicitly to Opteron when I add and update the thread from time to time. Thanks for the suggestion.
 
As of 05/04, the MSI K8N Neo Platinum based on Nforce3 250 GB chipset is available.
06/04, the EPoX 8KDA3+ (250 GB 754) is also now available.

This link gives some comparisons of various 754 250 GB motherboards:
250 GB Motherboards (754, 939)



A 250 GB motherboard coupled with an A64 ClawHammer/NewCastle, such as a Mobile A64 3000+/3200+ 1 MB L2, 1.4 V, CG revision (currently available), should give a very cost-effective system. IMO it is an even better choice than Nforce2 + Mobile Barton: about $150 more, but a 20-30% performance gain when running at the same overclocked frequencies.

E.g. Mobile 1.4 V, DTR 1.5 V, desktop 754 CG ClawHammer/NewCastle
2800+ around $180
3000+ around $220
3200+ around $280
motherboard around $150

940, 754, 939 CPU models and specifications

Typical Overclocking Systems for 754, 939
 
Dual-core

OPN code
Desktop A64 X2 939 (90 nm SOI DSL) Toledo
4400+: ADA4400DAA6CD 1.35/1.4V (JH8 E6 rev, 00020F32h) Toledo, 2x 1 MB L2, 2.2 GHz, x11, 110 W
4800+: ADA4800DAA6CD 1.35/1.4V (JH8 E6 rev, 00020F32h) Toledo, 2x 1 MB L2, 2.4 GHz, x12, 110 W

Desktop A64 X2 939 (90 nm SOI DSL) Manchester
3800+: ADA3800DAA5BV 1.35/1.4V (BH8 E4 rev, 00020FB1h) Manchester, 2x 512 KB L2, 2.0 GHz, x10, 89 W
4200+: ADA4200DAA5BV 1.35/1.4V (BH8 E4 rev, 00020FB1h) Manchester, 2x 512 KB L2, 2.2 GHz, x11, 89 W
4600+: ADA4600DAA5BV 1.35/1.4V (BH8 E4 rev, 00020FB1h) Manchester, 2x 512 KB L2, 2.4 GHz, x12, 89 W

comparing to San Diego 4000+ rated 2.4 GHz 1 MB L2
(the second core in X2 gets an 800+ PR rating)


Link to dual core articles:

AMD's Athlon 64 X2 3800+ processor, Let the upgrades begin (from Techreport)

Affordable Dual Core from AMD: Athlon 64 X2 3800+ (from Anandtech)

An interesting article about
Intel "Pentium D" and "840 Extreme Edition" Dual-Core CPUs,
with comparison to AMD X2 (4800+) on power, various benchmarking results and overclocking, ....
Performances of dual core with HT (Pentium 840EE), dual core without HT (Pentium D, X2), single core with HT (P4) and single cores without HT (A64) are compared.
http://www.overclockers.com.au/article.php?id=384519

Anandtech's article on AMD dual core (April 2005)
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2397&p=1

Hardocp X2 preview
http://www.hardocp.com/article.html?art=NzY2


Background

AMD to Release First x86 Multi-Core Processors mid-2005
http://www.amd.com/us-en/0,,3715_11787,00.html?redir=CPPA64

A dual-core A64 is a single chip (die) comprising two A64 cores. The inter-processor communication latency is much lower than with two separate processors on a dual-processor motherboard, which gives a dual-core system much better overall performance as well as a lower component count.

The 90 nm shrink enables this, as more transistors fit into a given die size. The die cannot be made too big for yield reasons: ideally die size < 100 mm2, typically 150 mm2 at most.

In a dual-core die, each core has its own L1 and L2 caches. The two cores on the same die communicate through the internal cross-bar switch, so it is an extension of the current single-core A64 architecture. Currently, single-core Opterons are connected via coherent HT links to form MP systems. The dual-core chip communicates with the rest of the system via the on-chip memory controller and the HT link(s).

Dual-core is different from a dual/multiple-execution-unit core, where two or more execution units are on the same die and share the same cache, as in the SMT (hyper-threading) architecture. In such an architecture, the execution units may compete for data in the cache and slow down overall performance at the cache level in some instances, if the task cannot be partitioned and the subtasks cannot be scheduled "properly" onto the dual/multiple execution units.

The two cores of a dual-core A64 share the same memory controller (which is already on die for low latency), so they may still compete for main memory (memory contention) when both cores have cache misses, but this occurs much less frequently than SMT execution units competing for cache data. With the low-latency memory controller, the dual-core design with per-core caches should perform much better than a core built with dual/multiple execution units under SMT.

Intel recently also announced plans for dual-core chips, and is also considering on-die memory controller.

Speculation:
There is a dual-core Opteron demo chip which is socket 940 compatible.
A 90 nm dual-core 939 may well still be a socket 939 CPU which requires only a bios upgrade and whose cooling requirement is not much higher than that of a 130 nm single core. This is another advantage of a 939 system for future compatibility and upgradability.


According to this article, AMD's chief Hammer architect (Kevin McGrath) said at the Fall Processor Forum that some dual-cores are 939-compatible.
http://www.extremetech.com/article2/0,1558,1666609,00.asp
from above article said:
...
Each dual-core chip will require about 205 million transistors, McGrath said. However, fabricating the chip in a 90-nm process will limit the die size to about that of a 130-nm Opteron; die size is a key determinant of a chip's cost. Finally, the dual-core chip will maintain socket compatibility with AMD's 939-pin processors, he said.



Update: AMD Tips Dual-Core Details, Performance (October 5, 2004)
http://www.extremetech.com/article2/0,1558,1666805,00.asp

AMD's Dual Core 90nm Opteron Demonstration Dissected (from AMDZone)

AMD Rev E Dual Core CPUs Info (from vr-zone)

AMD Flashes Dual-Core Microprocessors (from xbitlabs)

AMD Targets to Counter Strike Intel with Dual-Core Chips (from xbitlabs)
 
Wow my brain is smoking over this wealth of info, keep it going!! Hope this helps me get the most out of my Chantech NF3 250 non GB version.

CK
 
Thank you, hitechjb1! Excellent information. I'll be able use it in 6 months to 1 year when I'm going to get the upgrade itch again. I just recently took the Mobile XP path, so I have to stay put for awhile. Anyway, it looks like the FX53 is only 10% to 30% faster than my current setup, depending on the benchmark. Hopefully in 6-12 months I can get something 50% faster for less $$.
 
Summary as of June 03, 04.

This post gives some typical examples on 754, 939, 939 FX systems (scroll down to end of the post)
http://www.ocforums.com/showthread.php?p=2748988#post2748988

This post gives the performance tradeoff between the various A64 platforms, Barton and P4s.
http://www.ocforums.com/showthread.php?p=2748998#post2748998



A review comparing some K8T800 Pro and Nforce3 250 GB motherboards for 754 platform.
- Abit KV8 PRO (VIA K8T800 PRO)
- Chaintech VNF3-250 (nVidia nForce3-250)
- Epox 8KDA3+ (nVidia nForce3-250Gb)
- Gigabyte K8NSNXP (nVidia nForce3-250)
- MSI K8N Neo Platinum (nVidia nForce3-250Gb)
- nVidia nForce3-250Gb Reference Board
It gives the editor's choice to the Epox 8KDA3+, and second choice to both the Chaintech VNF3-250 and the MSI K8N Neo Platinum.
http://www.anandtech.com/chipsets/showdoc.html?i=2063


This is a summary of Desktop A64, Mobile A64 DTR and Mobile A64, A64 FX for socket 940/754/939.
940, 754, 939 CPU models and specifications

Typical Overclocking Systems for 754, 939
 
Difference between a ClawHammer and a NewCastle

Based on the model number/OPN code, there are ways to tell a CG revision. The link below lists all the CG revisions for 754 and 939 ClawHammer and NewCastle. By reading the CPU model number or a seller's listing, one can tell whether a CPU is a CG revision.
940, 754, 939 CPU models and specifications

Simply put,
a CG rev. ClawHammer has 1 MB L2 cache.
a CG rev. NewCastle has 512 KB L2 cache.

There are two questions commonly asked:
1. Which would overclock better?
Most believe NewCastle may be able to overclock 100+ MHz better. (IMO there is still not enough data to completely confirm that.)

It would take a NewCastle running 100-150 MHz faster, equivalent to 4-6% at the 2.5 GHz level, to break even with a ClawHammer on average performance (with both systems running the same memory and system bus frequencies), due to the difference in L2 cache size.

2. For the same frequencies of CPU, memory bus, system bus (HT), ..., how do the two compare in performance? This post discusses that:
How to compare ClawHammer and NewCastle
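The break-even estimate in question 1 is simple arithmetic, sketched here in Python. The function name is hypothetical; the 100-150 MHz and 2.5 GHz figures are the ones quoted in the post.

```python
def clock_gain_percent(extra_mhz, base_mhz):
    """Extra clock expressed as a percentage of the base clock."""
    return 100.0 * extra_mhz / base_mhz


BASE_MHZ = 2500  # the "2.5 GHz level" used in the post

for extra in (100, 150):
    gain = clock_gain_percent(extra, BASE_MHZ)
    print(f"+{extra} MHz at {BASE_MHZ} MHz = {gain:.0f}% more clock")
```

So the 4-6% range is just 100/2500 and 150/2500, i.e. the clock headroom a NewCastle would need to offset the average cache-size advantage.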


Appendix:

Impact of cache size on performance

In more detail, the performance difference between the two CPUs with 512 KB and 1 MB L2 can range from 0 to 10+% when both systems are clocked to the same frequencies, across different types of applications: CPU intensive (requiring a big enough cache), CPU intensive requiring a small cache (about 0% advantage), memory intensive, games, scientific (folding), .... The 3% to 5% usually quoted between the two is just a brief, average representation of the situation.

For gaming applications, the bigger L2 does provide a few % (say 3-9%) performance advantage at the same frequencies, as shown in the links below, which give a detailed analysis and breakdown of various applications between the two.

The performance impact of cache size on various applications, benchmarks, processor performance, ... has been well studied in industry and academia, and there is no single number that describes it all. It boils down to the specific type of application.

As a simple rule and trend, there is a few % overall performance advantage when L2 cache size is doubled, and that % becomes smaller as both cache sizes increase at the same ratio.

Benchmark analysis of some A64, Barton, P4 (part 2)

Benchmark analysis of some A64, Barton, P4 (part 1)


Cache and CPU performance
 
How to compare ClawHammer and NewCastle

The choice between ClawHammer and NewCastle has been discussed a lot recently. I think we have to be careful about claiming which is better. This is not a matter of subjective choice and claim. Supporting numbers speak for themselves, and will in turn help to make the right decisions in building systems.

1. If both CPUs are put in the SAME operating conditions, i.e. same frequencies of CPU, memory bus and timing, system bus, cooling, then from theoretical analysis and actual measurement, the double-sized cache CPU such as the 1 MB L2 ClawHammer outperforms the smaller cache CPU such as the 512 KB L2 NewCastle by a few % on average, some more and some less; for programs whose code and data fit in cache, the difference may even be 0%.

2. Under the SAME frequencies of CPU, memory bus and timing, system bus, cooling, the dual-channel 939/940 have 80% more effective memory bandwidth than the 754, hence they are always better in performance, especially for memory intensive programs such as video and image streaming and applications using spatially structured data as in scientific computation, with up to 20-80% higher performance (e.g. PCmark02 memory test, Sandra memory bandwidth, Sciencemark Stream, many scientific programs). For video and image streaming, data needs to be refreshed constantly from main memory (L3) to the on-chip L2 via the memory bus, as the size of the data >> L2 size at any given time. In such situations, the high dual-channel memory bandwidth delivers a marked performance advantage.

3. For comparing CPU, we have to know their overclockability. If a smaller cache one can overclock higher beyond certain frequency to get back the loss in performance due to cache size, then the conclusion can be different.

E.g. a good example is comparing Barton (512 KB L2) and Tbred B DLT3C (256 KB L2): the answer can go either way, depending on the CPU type and stepping.
Tbred B is slightly better than or about the same as the 1st generation of desktop Barton, since at one time the former could be clocked 150 - 200 MHz higher in absolute terms under the same cooling.

But later, mobile Barton clearly performs better than Tbred B DLT3C, as the mobile can be clocked 150 - 200 MHz higher, in addition to having the bigger L2.

So don't mix up
- the intrinsic advantage of cache size and memory bandwidth under the same condition of operating frequencies (and cooling), and
- the performance leverage delivered by certain type of CPU due to its overclockability.
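Point 2 can be put in rough numbers. The sketch below computes theoretical peak bandwidth from bus width and DDR rating, which is standard DDR arithmetic rather than anything measured in this thread, and then applies the 80% effective gain quoted above. The function name and the choice of DDR400 are assumptions for illustration.

```python
def peak_bandwidth_mb_s(bus_width_bits, ddr_rating):
    """Theoretical peak in MB/s: bytes per transfer x million transfers per second."""
    return bus_width_bits // 8 * ddr_rating


# Assumption: DDR400 memory on both platforms.
single_channel = peak_bandwidth_mb_s(64, 400)   # socket 754, 64-bit bus
dual_channel = peak_bandwidth_mb_s(128, 400)    # socket 939/940, 128-bit bus

EFFECTIVE_GAIN = 0.80  # the "80% more effective bandwidth" estimate from point 2
effective_dual = single_channel * (1 + EFFECTIVE_GAIN)

print(f"754 peak:          {single_channel} MB/s")
print(f"939/940 peak:      {dual_channel} MB/s")
print(f"939/940 effective: about {effective_dual:.0f} MB/s using the 80% estimate")
```

The peak figure doubles on paper, while the 80% estimate says how much of that doubling shows up in practice.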



 
In H1 2005, there will be 2GHz and 2.2GHz Socket 754 Athlon 64s (w/ 1MB L2 cache) that are likely to be 90nm and will have significantly lower TDPs (~35W) than current desktop and DTR Athlon 64s, which have TDPs of 89W and 81.5W respectively.

AMD.jpg


AMD will be introducing new 'Oakville' and 'Lancaster' low-power mobile Athlon 64 cores, both of which may well be 90nm. Current low-power mobile Athlon 64s are based on a 130nm 'Odessa' core, which, back in September, was said to be going to 90nm. 130nm 'Odessa' low-power mobile Athlon 64s are rated at a TDP of 35W.

With AMD seemingly committed to keeping its mobile processors on its single-channel Socket 754 (though DTR and higher-powered mobiles are likely to come out for dual-channel Socket 939), the 3700+ may not be the "end-of-the-line" for Socket 754 desktop upgrades.
 
Power states of A64 desktop, mobiles (DTR, 1.4V, 1.2V) for 754 and 939

F4A is part of the OPN code for CG ClawHammer 754, which spans desktop, mobile DTR, mobile 1.4 V

FC0 is part of the OPN code for CG NewCastle 754, which currently spans desktop and mobile 1.2 V. I have not seen it for mobile DTR or mobile 1.4 V.

Mobile DTR, mobile 1.4V and mobile 1.2V for 754 are the "official" names used in the AMD tech docs for A64 mobile CPUs.


Apart from the obvious differences in voltage, frequency and power rating between the desktop and the various mobile CPUs, each type of mobile has its own voltage, power and low-power state specifications and power up/down sequences. For details, study the AMD tech doc for each of them. As such, it requires certain bios and motherboards to handle them, .... As more bios and motherboards mature, I hope/expect more and more bios will be able to handle them.

Actually, based on reading the AMD tech doc (unless there are typos or missing information),
the low-power states of the desktop A64 are different from those of the mobile DTR, mobile 1.4V and mobile 1.2V. The latter three have the same low-power states.

The 939 desktop also has different low-power states from the 754 desktop.


Desktop 754:
Max P-state,
Intermediate P-state #1,
Intermediate P-state #2,
Min P-state,
Halt/Stop Grant Max P-state,
Halt/Stop Grant Min P-state,
S3

Mobile 754 DTR, 1.4V, 1.2V:
Max P-state,
Intermediate P-state #1,
Intermediate P-state #2,
Min P-state,
Halt/Stop Grant Max P-state,
Halt/Stop Grant Min P-state,
C3/S1 Min P-state,
S3

Desktop 939:
Max P-state,
Intermediate P-state #1,
Intermediate P-state #2,
Intermediate P-state #3,
Min P-state,
Halt/Stop Grant Max P-state,
Halt/Stop Grant Min P-state,
S3
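The three state lists differ by only one entry each, which is easy to miss by eye. This Python sketch diffs them against the desktop 754 list; the state names are copied from the lists above, and the helper name is made up for illustration.

```python
DESKTOP_754 = [
    "Max P-state", "Intermediate P-state #1", "Intermediate P-state #2",
    "Min P-state", "Halt/Stop Grant Max P-state",
    "Halt/Stop Grant Min P-state", "S3",
]
MOBILE_754 = [
    "Max P-state", "Intermediate P-state #1", "Intermediate P-state #2",
    "Min P-state", "Halt/Stop Grant Max P-state",
    "Halt/Stop Grant Min P-state", "C3/S1 Min P-state", "S3",
]
DESKTOP_939 = [
    "Max P-state", "Intermediate P-state #1", "Intermediate P-state #2",
    "Intermediate P-state #3", "Min P-state", "Halt/Stop Grant Max P-state",
    "Halt/Stop Grant Min P-state", "S3",
]


def extra_states(states, reference):
    """States listed for one CPU type but not for the reference type."""
    return [s for s in states if s not in reference]


print("Mobile 754 adds: ", extra_states(MOBILE_754, DESKTOP_754))
print("Desktop 939 adds:", extra_states(DESKTOP_939, DESKTOP_754))
```

In other words, the mobile parts add the C3/S1 Min P-state, and the 939 desktop adds a third intermediate P-state.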


940, 754, 939 CPU models and specifications


Appendix:

Quote from AMD technical document "AMD Functional Data Sheet, 754 Pin Package"

A64_754_powerstate.JPG


939 does not support ACPI state C2, C3 (as of June 2004 doc)
940 does not support ACPI state C2, C3 (as of June 2004 doc)
 
Relationship between CPU_memory_divider and CPU_multiplier, memory_HTT_ratio

Here FSB is used interchangeably with HTT, in the various settings for A64. In A64, FSB or HTT is a CPU frequency signal, NOT a physical bus. The system bus external to the CPU is the HT (HyperTransport bus).

memory_bus_frequency = CPU_frequency / CPU_memory_divider
or
memory_bus_frequency = CPU_multiplier x HTT / CPU_memory_divider
where
CPU_memory_divider = ceiling(CPU_multiplier / memory_HTT_ratio)
where memory_HTT_ratio = 1/1, 1/2, 2/3, 3/4, 4/5, 5/6, 7/8, 9/10, ... (availability of some settings is bios/motherboard dependent)
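The formulas above can be sketched in Python as follows. The function names are hypothetical. One assumption needs flagging: the divider values quoted elsewhere in this post (13.5, 11.5, 12.5) suggest the ceiling is taken to the next half step rather than the next integer, so the sketch rounds up to a multiple of `step`, defaulting to 1/2. That is an inference from the numbers in the thread, not a documented rule.

```python
import math
from fractions import Fraction


def cpu_memory_divider(cpu_multiplier, memory_htt_ratio, step=Fraction(1, 2)):
    """ceiling(CPU_multiplier / memory_HTT_ratio), rounded up to a multiple of `step`.

    Assumption: half steps (step=1/2) match the dividers quoted in the thread;
    pass step=Fraction(1) for integer-only dividers.
    """
    exact = Fraction(cpu_multiplier) / Fraction(memory_htt_ratio)
    return math.ceil(exact / step) * step


def memory_bus_frequency(cpu_multiplier, htt_mhz, memory_htt_ratio, step=Fraction(1, 2)):
    """memory_bus_frequency = CPU_multiplier x HTT / CPU_memory_divider, in MHz."""
    divider = cpu_memory_divider(cpu_multiplier, memory_htt_ratio, step)
    return float(cpu_multiplier * htt_mhz / divider)


# Example: multiplier 10 at the rated HTT of 200 MHz with a 5/6 ratio.
print(cpu_memory_divider(10, Fraction(5, 6)))
print(round(memory_bus_frequency(10, 200, Fraction(5, 6)), 1))
```

Using `Fraction` for the ratios avoids floating-point surprises such as 10 / (5/6) landing a hair above 12 and being rounded up by mistake.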

Impact of CPU multiplier on memory divider flexibility

For A64, since the memory divider available depends on the CPU multiplier, it is advantageous to know what memory modules to use in order to plan for what CPU and its multiplier to get.

In general, due to the discrete nature of CPU multiplier and memory divider, it may not be possible to have both the CPU and memory being topped out simultaneously (within 1 MHz), unless the ratio of the max CPU frequency to max memory frequency is an exact integer or half integer.

For example, refer to the memory_divider table below,
assume bios only has 1:1, 5:6, 2:3 memory_HTT_ratio
NC 2800+, max multiplier = 9, memory_divider available = 9, 11, 13.5
NC 3000+/CH 3200+, max multiplier = 10, memory_divider available = 9, 10, 11, 12, 13.5, 15

So unless you know exactly which memory_divider is needed and plan for the right CPU multiplier, a CPU with a higher max multiplier (such as 10) is in general more flexible (has more memory dividers) for setting up memory modules of various speeds than a CPU with a lower max multiplier (such as 9).
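The divider lists in the example can be reproduced with a short Python sketch. Two assumptions are made here: dividers are rounded up to the next half step (inferred from the 13.5 in the lists), and the "max multiplier = 10" row considers multipliers 9 and 10 only, since that is what matches the quoted values. The function names are made up for illustration.

```python
import math
from fractions import Fraction

# The three ratios the example assumes the bios offers.
RATIOS = [Fraction(1, 1), Fraction(5, 6), Fraction(2, 3)]


def divider(multiplier, ratio, step=Fraction(1, 2)):
    """Round multiplier / ratio up to a multiple of `step` (half steps assumed)."""
    return math.ceil(Fraction(multiplier) / ratio / step) * step


def available_dividers(multipliers, ratios=RATIOS):
    """Sorted set of dividers reachable with the given multipliers and ratios."""
    return sorted({float(divider(m, r)) for m in multipliers for r in ratios})


print("NC 2800+ (multiplier 9):       ", available_dividers([9]))
print("NC 3000+/CH 3200+ (9 and 10):  ", available_dividers([9, 10]))
```

Going from one usable multiplier to two doubles the number of dividers in this case, which is the flexibility argument made above.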

Note on bios with memory_HTT_ratio of 9:10 and 8:10

The additional offering of 9:10 and 8:10 (same as 4:5) (in DFI bios) on top of the other memory_HTT_ratios of 5:6, 2:3, 1:2, ... gives a better chance to fine-tune the memory and CPU speed during overclocking.

For example, for a max CPU multiplier of 10 (3000+ NC, 3200+ CH), the 9:10 and 4:5 may give additional memory_dividers of 11.5 and 12.5 on top of the 9, 10, 11, 12, 13.5, 15.

So instead of tuning steps of about 20 MHz in memory, the step may reduce to about 10 MHz with the added 9:10 and 8:10 (4:5) ratios.
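To see where the "about 20 MHz" and "about 10 MHz" steps come from, here is a Python sketch that computes the memory speeds for both divider lists at a fixed CPU clock. The 2500 MHz CPU clock is a hypothetical overclock target chosen for illustration, not a figure from this post, and the function names are made up.

```python
def memory_speeds(cpu_mhz, dividers):
    """Memory bus frequency in MHz for each divider, fastest first."""
    return [cpu_mhz / d for d in sorted(dividers)]


def steps(speeds):
    """Gap in MHz between each pair of neighbouring memory speeds."""
    return [round(a - b, 1) for a, b in zip(speeds, speeds[1:])]


CPU_MHZ = 2500  # hypothetical CPU clock, held fixed while the divider changes

BASE = [9, 10, 11, 12, 13.5, 15]   # dividers without the 9:10 and 4:5 ratios
EXTENDED = BASE + [11.5, 12.5]     # with the two extra dividers added

print("steps without 9:10, 4:5:", steps(memory_speeds(CPU_MHZ, BASE)))
print("steps with 9:10, 4:5:   ", steps(memory_speeds(CPU_MHZ, EXTENDED)))
```

The extra dividers only fill in the middle of the range (between dividers 11 and 13.5), so the finer steps apply there and the gaps at the fast end stay as they were.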


How to determine memory bus frequency

1. First, the memory_HTT_ratio is determined from the bios setting.

memory_HTT_ratio = ..., 6/5, 5/4, 4/3, 3/2, 2/1, 1/2, 2/3, 3/4, 4/5, 5/6, 9/10, ...
In some bios, only a few can be selected.

With reference to the rated HTT 200 MHz,
- HTT = 200, max memory frequency = 200, memory_HTT_ratio = 1/1
- HTT = 200, max memory frequency = 183, memory_HTT_ratio = 11/12
- HTT = 200, max memory frequency = 180, memory_HTT_ratio = 9/10
- HTT = 200, max memory frequency = 175, memory_HTT_ratio = 7/8
- HTT = 200, max memory frequency = 166, memory_HTT_ratio = 5/6
- HTT = 200, max memory frequency = 160, memory_HTT_ratio = 4/5
- HTT = 200, max memory frequency = 150, memory_HTT_ratio = 3/4
- HTT = 200, max memory frequency = 140, memory_HTT_ratio = 7/10
- HTT = 200, max memory frequency = 133, memory_HTT_ratio = 2/3
- HTT = 200, max memory frequency = 120, memory_HTT_ratio = 3/5
- HTT = 200, max memory frequency = 100, memory_HTT_ratio = 1/2
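The list above is just HTT multiplied by the ratio, truncated to whole MHz. A small Python sketch (with a made-up function name) regenerates it, and can be reused for other HTT values:

```python
from fractions import Fraction

# Ratios in the same order as the list above.
RATIOS = [
    Fraction(1, 1), Fraction(11, 12), Fraction(9, 10), Fraction(7, 8),
    Fraction(5, 6), Fraction(4, 5), Fraction(3, 4), Fraction(7, 10),
    Fraction(2, 3), Fraction(3, 5), Fraction(1, 2),
]


def max_memory_frequency(htt_mhz, ratio):
    """HTT x ratio, truncated to whole MHz as in the list (183, 166, 133)."""
    return int(htt_mhz * ratio)


for r in RATIOS:
    label = f"{r.numerator}/{r.denominator}"
    print(f"HTT = 200, max memory frequency = {max_memory_frequency(200, r)}, "
          f"memory_HTT_ratio = {label}")
```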

2. Using the above formula for CPU_memory_divider,
CPU_memory_divider = ceiling(CPU_multiplier / memory_HTT_ratio)
the CPU_memory_divider can be calculated based on only CPU_multiplier and memory_HTT_ratio.
Some common cpu_memory_dividers generated by spreadsheet are listed in this table.

A64_cpu_memory_divider.JPG


In the table, the 1/2 (half-step) values may not be supported by most motherboards and bios. If that is the case, they should be rounded up to the next integer. E.g. 13.5 should become 14.


3. Then the memory_bus_frequency can be determined

memory_bus_frequency = CPU_frequency / CPU_memory_divider
or
memory_bus_frequency = CPU_multiplier x HTT / CPU_memory_divider
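Steps 1 to 3 can be strung together as one worked example in Python. The function name and the sample settings (multiplier 9, HTT 250 MHz, ratio 2/3) are hypothetical. The `half_steps` switch covers both cases described above: the table's half-step dividers, and boards that need them rounded up to the next integer.

```python
import math
from fractions import Fraction


def memory_setup(cpu_multiplier, htt_mhz, memory_htt_ratio, half_steps=True):
    """Return (divider, CPU MHz, memory MHz) for one bios setting.

    Assumption: with half_steps=True the divider is rounded up to the next
    0.5, as in the table; with False it is rounded up to the next integer.
    """
    step = Fraction(1, 2) if half_steps else Fraction(1)
    # Step 2: CPU_memory_divider = ceiling(CPU_multiplier / memory_HTT_ratio)
    divider = math.ceil(Fraction(cpu_multiplier) / memory_htt_ratio / step) * step
    # Step 3: memory_bus_frequency = CPU_multiplier x HTT / CPU_memory_divider
    cpu_mhz = cpu_multiplier * htt_mhz
    return float(divider), cpu_mhz, float(cpu_mhz / divider)


# Step 1: pick the ratio in the bios, here 2/3 with a 9x multiplier at HTT 250.
for half in (True, False):
    d, cpu, mem = memory_setup(9, 250, Fraction(2, 3), half_steps=half)
    print(f"half_steps={half}: divider {d}, CPU {cpu} MHz, memory {mem:.1f} MHz")
```

The two runs show why the rounding matters: the same bios setting yields a slightly slower memory clock on a board that cannot do the 13.5 divider.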


Overclocking setting for various bus frequencies

How to choose memory divider and memory_HTT_ratio

Example to setup frequencies for CPU and memory
 