Joeteck
I copied and pasted this information from the slow website. I'm not sure about the data's integrity, but it is still interesting reading. I had read a lot about the Conroe and would have switched back to Intel based on its performance. If you read the entire thing, you will be stunned as well. - Joeteck
Conroe performance claim being busted
Recall that Intel's Mooly Eden said Conroe will be 20% faster than AMD's future chips, without even knowing AMD's plans? During the Spring 2006 IDF, Intel set up a Conroe box and an Athlon 64 box, then directed benchmarkers such as Anand to push buttons, but peeking into the Windows Device Manager of the alleged Conroe wasn't allowed.
During the IDF, I emailed various Intel execs, AMD execs and Anand. I pointed out that such a pre-arranged black-box Intel setup against AMD was unfair, and I challenged Intel to lend the Conroe box to Anand for a real drill. However, Intel did not dare to answer such a simple challenge based on the rules of fair competition. The INQ sharply criticised this kind of guerrilla benchmarketing.
Now, for the very first time, someone actually got hold of a Conroe chip in their own lab and did some tests. It was a 2.4GHz Conroe (CPU-Z) against an Athlon 64 overclocked to 2.8GHz. The overclocked Athlon 64 had a 2.8/2.4 - 1 = 16.7% clock speed advantage.
The following results were obtained by running 32-bit ScienceMark binaries optimized for Intel Pentium:
Molecular Dynamics
Athlon 64: 1872.68
Conroe: 2133.38 -- 14% faster
Primordia (Energy calculations for 1 atom)
Athlon 64: 1506.83 -- 10% faster
Conroe: 1365.85
Cryptography
Athlon 64: 1345.05 -- 26.2% faster
Conroe: 1065.59
STREAM
Athlon 64: 1512.55 -- 21.7% faster
Conroe: 1242.94
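For anyone who wants to check the percentages, here is a minimal Python sketch that recomputes each lead from the raw scores quoted above. The scores come straight from the listing; the dictionary layout and the `lead` helper are only illustrative names, not part of ScienceMark.

```python
# Raw ScienceMark scores quoted above (higher is better).
# Athlon 64 is overclocked to 2.8GHz, Conroe runs at 2.4GHz.
raw_scores = {
    "Molecular Dynamics": {"Athlon 64": 1872.68, "Conroe": 2133.38},
    "Primordia": {"Athlon 64": 1506.83, "Conroe": 1365.85},
    "Cryptography": {"Athlon 64": 1345.05, "Conroe": 1065.59},
    "STREAM": {"Athlon 64": 1512.55, "Conroe": 1242.94},
}


def lead(scores):
    """Return (winner, percent lead over the other chip) for one test."""
    (name_a, a), (name_b, b) = scores.items()
    if a >= b:
        return name_a, (a / b - 1) * 100
    return name_b, (b / a - 1) * 100


for test, scores in raw_scores.items():
    winner, pct = lead(scores)
    print(f"{test}: {winner} leads by {pct:.1f}%")
```

The lead is always expressed relative to the slower chip's score, which is how the figures in the listing appear to have been derived.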
The above results were for an Athlon 64 overclocked to 2.8GHz and a Conroe at 2.4GHz, with the Athlon having a 16.7% clock speed advantage. For a direct comparison at the same clock speed, we normalize the Conroe scores by taking the frequency difference into account. Assuming the best-case scenario, in which Conroe scores scale linearly with clock speed, we multiply the Conroe scores by a factor of 2.8/2.4. Thus, with a 2.8GHz Conroe, we would have:
Molecular Dynamics
Athlon 64 2.8GHz: 1872.68
Conroe 2.8GHz: 2133.38 * 2.8/2.4 = 2489 -- 32.9% faster
Primordia (Atom)
Athlon 64 2.8GHz: 1506.83
Conroe 2.8GHz: 1365.85 * 2.8/2.4 = 1593.49 -- 5.8% faster
Cryptography
Athlon 64 2.8GHz: 1345.05 -- 8.2% faster
Conroe 2.8GHz: 1065.59 * 2.8/2.4 = 1243.19
STREAM
Athlon 64 2.8GHz: 1512.55 -- 4.3% faster
Conroe 2.8GHz: 1242.94 * 2.8/2.4 = 1450
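The normalization above can be reproduced with a short Python sketch. It assumes, as the article does, that Conroe scores scale perfectly linearly with clock speed, which is a best case for Conroe rather than a measured fact. The function and variable names are made up for illustration.

```python
ATHLON_GHZ = 2.8  # overclocked Athlon 64
CONROE_GHZ = 2.4  # Conroe as tested

# (Athlon 64 score at 2.8GHz, Conroe score at 2.4GHz), as quoted above.
measured = {
    "Molecular Dynamics": (1872.68, 2133.38),
    "Primordia": (1506.83, 1365.85),
    "Cryptography": (1345.05, 1065.59),
    "STREAM": (1512.55, 1242.94),
}


def scale_to_clock(score, from_ghz, to_ghz):
    """Scale a score linearly with clock speed (optimistic assumption)."""
    return score * to_ghz / from_ghz


def percent_lead(faster, slower):
    """Percent by which the faster score exceeds the slower one."""
    return (faster / slower - 1) * 100


for test, (athlon, conroe) in measured.items():
    conroe_scaled = scale_to_clock(conroe, CONROE_GHZ, ATHLON_GHZ)
    if conroe_scaled >= athlon:
        winner, pct = "Conroe", percent_lead(conroe_scaled, athlon)
    else:
        winner, pct = "Athlon 64", percent_lead(athlon, conroe_scaled)
    print(f"{test}: Conroe at 2.8GHz = {conroe_scaled:.2f}, "
          f"{winner} leads by {pct:.1f}%")
```

If real scaling is less than linear, which is common once memory speed becomes the limit, the Conroe figures here would be overstated.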
ScienceMark is strictly a CPU/memory test. It doesn't involve video or disk I/O, so it is basically a raw speed test. ScienceMark is freely available from http://www.sciencemark.org/ for both Windows XP and Windows XP x64.
However, the above results show a violent CPU performance fluctuation for Conroe, from being about 33% faster to being about 8% slower. How can this be explained?
The cause of the Conroe performance fluctuations can't be the type of computation involved. We notice that MolDyn is a floating point computation while the Cipher is an integer computation. However, both MolDyn and Primordia are floating point calculations on quantum mechanical properties of matter, yet Primordia showed a relative performance drop of roughly 27 percentage points.
As we look deeper into ScienceMark, we notice that in the default MolDyn benchmark setting there are only 4 cells with a simple cubic lattice, no more than 32 molecules are involved, and about 2MB to 4MB of memory is needed. The Primordia calculation for a single Ag (silver) atom with 47 electrons needs just a bit more memory than MolDyn. However, both the Cipher and STREAM tests involve a lot more than 4MB.
The reason why Conroe did so well in the MolDyn test is simple: Conroe has a huge 4MB unified cache. For single-threaded tests that fit in 4MB, Conroe can just run out of the cache at very high speed -- another cheap gimmick at the expense of a very large die size.
However, once you go over the 4MB limit, Conroe is slower than the Athlon 64 at the same clock. Both the Cryptography and STREAM tests use a lot more than 4MB, which is larger than Conroe's cache, and Conroe immediately falls below the Athlon 64 on the performance curve.
I can bet on this: if one increases the number of cells in the MolDyn test to 9, thus increasing the working set beyond 4MB, Conroe will perform worse than the Athlon 64 at the same clock speed.
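A rough way to see the working-set effect on any machine is to time a bulk memory copy as the buffer grows past the cache size. The Python sketch below is only an illustration of that idea. It is not ScienceMark or the MolDyn test, the buffer sizes are arbitrary picks around the 4MB mark, and interpreter overhead and other running programs will blur the numbers.

```python
import time


def copy_bandwidth_mb_s(size_bytes, repeats=20):
    """Time repeated buffer copies and return throughput in MB/s."""
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    start = time.perf_counter()
    for _ in range(repeats):
        # Same-length slice assignment is a bulk copy done in C,
        # so memory and cache speed dominate rather than the interpreter.
        dst[:] = src
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid divide by zero
    return size_bytes * repeats / elapsed / 1e6


if __name__ == "__main__":
    # Hypothetical sizes chosen to straddle a 4MB cache. Each copy touches
    # two buffers, so the working set is twice the size shown.
    for mb in (1, 2, 4, 8, 16, 32):
        rate = copy_bandwidth_mb_s(mb * 1024 * 1024)
        print(f"{mb:2d}MB buffers: {rate:10.0f} MB/s")
```

If throughput drops sharply once the two buffers together no longer fit in cache, that is the same kind of cliff the article attributes to Conroe beyond 4MB. Confirming the bet itself would still require rerunning MolDyn with more cells on both chips.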
The conclusion is: clock for clock, the Athlon 64 will beat Conroe in real application environments that require a working set larger than 4MB, in other words larger than Conroe's 4MB cache. This means that in any real multi-tasking or server environment the Core architecture will be an underdog. Even worse, with Intel's shared cache architecture, cache thrashing is a distinct possibility under heavy loads.