
Performance/clock and performance/watt improvements


Firestrider

Member
Joined
Dec 9, 2003
Location
Orlando, FL
Starting from the very first processors made by Intel and AMD, how much have they improved up to now in performance/clock and performance/watt, if you measure performance in floating-point and integer operations?

Which processor has the best performance/watt and performance/clock across all companies (AMD, VIA, Intel, ARM, HP, IBM, etc.) and performance levels?
 
Investigating this would take me months, if not years, even if I spent hours on it every day... honestly :)

Int/FP operations would have to be tested with an application, and that's a contentious point, since applications all differ and may bottleneck in different parts of the CPU depending on their optimizations.

I suppose the first thing you'd need to do is write an application that keeps the whole pipeline full without any stalls, for each CPU, and then targets the Int or FP units individually. The only way to achieve that is to eliminate continuous RAM/HDD access.
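A minimal sketch of the shape such a test harness could take, assuming a working set small enough to stay in cache and separate integer and floating-point kernels. The names (`time_kernel`, `int_kernel`, `fp_kernel`) are made up for illustration, and in CPython the interpreter overhead dominates, so this only shows the structure; a real test would need compiled, per-architecture tuned kernels.

```python
import time

def int_kernel(data):
    # Integer-only work: multiply-accumulate over a small list.
    acc = 0
    for x in data:
        acc += x * 3
    return acc

def fp_kernel(data):
    # Floating-point-only work: multiply-accumulate over a small list.
    acc = 0.0
    for x in data:
        acc += x * 1.5
    return acc

def time_kernel(kernel, data, passes=200):
    # One untimed warm-up pass so the data is already resident in cache.
    kernel(data)
    start = time.perf_counter()
    for _ in range(passes):
        kernel(data)
    elapsed = time.perf_counter() - start
    # Element-operations per second (one multiply-accumulate per element).
    return passes * len(data) / elapsed

# Assumption: 1024 elements is small enough to avoid main-memory traffic.
small_ints = list(range(1024))
small_floats = [float(i) for i in small_ints]

int_rate = time_kernel(int_kernel, small_ints)
fp_rate = time_kernel(fp_kernel, small_floats)
print(f"int: {int_rate:.0f} ops/s, fp: {fp_rate:.0f} ops/s")
```

Dividing each rate by the clock speed or by measured power would then give the performance/clock and performance/watt figures being asked about.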
 
From the very first processor to Core i7, the improvement is probably in the area of millions of times.

The best CPU atm is clearly Core i7, in everything. It has the best performance : clock, the best performance : watt, and just the best performance, period (price : performance is debatable where all 8 threads can't be used). More exotic specialty chips might excel in niche applications, but overall Nehalem rocks everything.

Performance : watt is everything atm and Intel is pushing it hard (i7 has something like 30% better performance : clock while maintaining 20% better performance : watt over Penryn).

The US government has just ordered a ~20 petaflop supercomputer from IBM that is supposed to be up and running by 2012. I forget where I saw it, but they said they were projecting that level of performance at roughly the wattage that Roadrunner and Jaguar are using now (so the same power draw for a 20x more powerful super, i.e. 20x better performance : watt), 4 years after we broke 1 petaflop. That is moving even faster than Moore's law.
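A rough check of the "faster than Moore's law" remark, assuming Moore's law is taken as a doubling every two years (the popular formulation, which strictly concerns transistor counts rather than performance : watt) and taking the 20x-in-4-years figure above at face value. The function names are just for illustration.

```python
import math

def doubling_time_years(improvement_factor, years):
    # Solve 2 ** (years / T) == improvement_factor for T.
    return years * math.log(2) / math.log(improvement_factor)

def moores_law_factor(years, doubling_period_years=2.0):
    # Assumption: one doubling every `doubling_period_years` years.
    return 2 ** (years / doubling_period_years)

claimed = doubling_time_years(20, 4)   # 20x better perf : watt in 4 years
baseline = moores_law_factor(4)        # what a 2-year doubling gives in 4 years

print(f"implied doubling time: {claimed:.2f} years")
print(f"2-year doubling over 4 years: {baseline:.0f}x")
```

Under those assumptions, 20x in 4 years works out to a doubling roughly every 11 months, versus the 4x that a two-year doubling would give over the same span.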
 
Just to make something clear: I don't even remotely agree with the first three paragraphs of what Shiggity said, specifically because we're talking about every CPU out there in terms of Int/FP, not just desktop CPUs or desktop applications. Even for desktop alone, I don't agree with those statements.

Like I said, there is no one fixed, generic answer to what you've asked, and anything presented as such is misleading and incorrect. If you replaced the 100/130W i7 920s with a 70W retail part, or used a heavily multi-threaded, optimized application, you'd completely alter the standings in any test. The same can be said for quite a few CPUs.

Performance/clock in pure Int workloads is dominated by the Nehalem architecture among x86 CPUs. I'd really like to see someone back up and explain their answers when comparing with non-x86 CPUs and when gauging FP performance, though... :)
 
From what I've heard, the best processor for heavily multi-threaded int workloads is the UltraSPARC T1. Nehalem is probably the best for single-threaded int workloads. For heavy FP workloads I would think an ATI/Nvidia GPU (with limitations) or the IBM Cell processor would be the best.
 
Instead of simply disagreeing, please provide some evidence to support your claims. Show me a chip that beats the 8-core Xeon Nehalem and is anywhere even close in price. If you guys are just totally disregarding price, count me out of the discussion. Right now Opterons are in the biggest and fastest supercomputers, and the fastest ones will soon be Nehalems. Why wouldn't they use a different CPU if it were more efficient for the performance?

This discussion is a little silly anyway when GPUs outperform every CPU. The top GPUs get ~1 TFLOP of performance for ~300W and nothing beats that, with new GPUs expected to do ~2 TFLOPs for ~300W soon (Larrabee, GT300, RV800).

I'm watching the Zii processor atm, which is basically a much more powerful Cell.
 
The Cell can reach 256 GFLOPS with a power consumption of 60-80 watts: http://www.stanford.edu/class/cs379a/presentations/Max2.ppt

The RV770 (4850 in particular) can reach 1 TFLOPS with a power consumption of 110 watts: http://www.rage3d.com/print.php?article=/reviews/video/atirv770/architecture but you have to factor in the CPU overhead to "drive" one of these.

Core i7 965 can reach 51.2 GFLOPS with a TDP of 130W: http://www.intel.com/support/processors/sb/CS-023143.htm, http://download.intel.com/design/processor/datashts/320834.pdf
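Putting the three figures above side by side as GFLOPS per watt, as a quick sketch. The numbers are exactly the ones quoted above; the names in the code are only for illustration. Keep in mind this is not an apples-to-apples comparison: these are peak figures, the wattages mix power consumption with TDP, and as far as I can tell the Cell and RV770 numbers are single precision while the Core i7 number is double precision.

```python
# (peak GFLOPS, watts low, watts high), taken from the figures quoted above.
chips = {
    "Cell": (256.0, 60.0, 80.0),
    "RV770 (HD 4850)": (1000.0, 110.0, 110.0),
    "Core i7 965": (51.2, 130.0, 130.0),
}

def gflops_per_watt(gflops, watts):
    return gflops / watts

def efficiency_range(gflops, watts_low, watts_high):
    # Worst case uses the higher wattage, best case the lower one.
    return gflops_per_watt(gflops, watts_high), gflops_per_watt(gflops, watts_low)

for name, (gflops, w_low, w_high) in chips.items():
    worst, best = efficiency_range(gflops, w_low, w_high)
    print(f"{name}: {worst:.2f} to {best:.2f} GFLOPS/W")
```

The GPU figure also leaves out the host CPU needed to drive the card, so its real system-level number would be lower.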

I'm not sure what metric you use to measure int performance, but SPECint is probably a good benchmark.
 
The Cell can reach 256 GFLOPS with a power consumption of 60-80 watts: http://www.stanford.edu/class/cs379a/presentations/Max2.ppt

"Given the right task" - Just like GPU's destroy CPU's given the right task.

I also wonder how much these cost.

The RV770 (4850 in particular) can reach 1 TFLOPS with a power consumption of 110 watts: http://www.rage3d.com/print.php?arti...0/architecture but you have to factor in the CPU overhead to "drive" one of these.
CPU overhead!

Ouch, $999 apiece.


Intel's 80 core CPU - http://news.cnet.com/Intel-shows-off-80-core-processor/2100-1006_3-6158181.html
At 3.16GHz and with 0.95 volts applied to the processor, it can hit 1 teraflop of performance while consuming 62 watts of power.
Creative Zii - 2 ARM CPUs + 24 PPE's. Souped-up Cell architecture with more SoC features and more emphasis on 'green'. Each chip does 10 GFLOPS and focuses on power efficiency and size. www.zii.com
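For reference, peak figures like the 1 teraflop for the 80-core chip (and the 51.2 GFLOPS for the Core i7 965 earlier in the thread) are normally just cores x clock x floating-point operations per cycle per core. A small sketch, where the per-cycle values are my assumptions and not taken from the linked pages: 4 double-precision FLOPs per cycle per Nehalem core, and 4 single-precision FLOPs per cycle per tile on the 80-core research chip.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle_per_core):
    # Theoretical peak only: assumes every FP unit issues every cycle.
    return cores * clock_ghz * flops_per_cycle_per_core

# Assumption: 4 DP FLOPs/cycle/core for the Core i7 965 (4 cores, 3.2 GHz).
i7_965 = peak_gflops(4, 3.2, 4)

# Assumption: 4 SP FLOPs/cycle/tile for the 80-core chip at 3.16 GHz.
eighty_core = peak_gflops(80, 3.16, 4)

print(f"Core i7 965 peak: {i7_965:.1f} GFLOPS")
print(f"80-core peak: {eighty_core:.1f} GFLOPS")
print(f"80-core at 62 W: {eighty_core / 62:.1f} GFLOPS/W")
```

Real applications land well below these peaks, which is why the benchmark choice matters so much in this discussion.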
 