
AMD vs Intel


Dolk

I once overclocked an Intel
Joined
Mar 3, 2008
Have you ever wanted to ask a question about AMD and Intel to understand what really makes them different? Have you ever asked such a question and been given the short answer, "apples vs oranges"?

I'm opening up a small AMA here to allow anyone to ask me questions about the two top CPU manufacturers. I will try my best to answer your questions in a reasonable amount of time. (I tend to get busy from time to time.)

Who am I? My name is Dolk and I have a Master's in Computer and Electrical Engineering. I've worked on research projects dealing with everything from nano-materials up to RF sensors, and I've been following and studying computer architecture and materials science since I started my career.

So anyone can ask me questions like:

"What are the difference between Hyper-threading, and cores?"
"Why is AMD so far behind Intel?"
"Will ARM make a difference?"
"Can you explain how the CPU works and the differences with AMD and Intel?"

Questions already answered:
CPU multiplexer and base clock:
AMD Modules vs Intel Hyper-Threading
 
I have a couple. You have the reference clock and the multiplier.
1. How does the multiplier work? I know it'll multiply the base clock by 10, say, but how does it do that? And is there any real truth to the old "high base clock gives better performance"? I understand how the frequency works with the trough and peak, but does attaining it two different ways make any difference if all other buses are the same, i.e. 200 x 25 vs 250 x 20?
 
CPU multiplexer and base clock:

In all digital chips, there is a reference clock that drives the logic of the chip. In the case of CPUs we have the HTT for AMD and the Bclk for Intel. In both cases, these clocks drive the communication bus between all the peripherals to and from the CPU. Since this bus is shared by so many different devices, it would be impossible to drive the signal at high speeds. It is also not practical to generate a high-speed reference clock directly. The original clock source is a crystal that vibrates when electricity is passed through it. Most crystals sit around 37MHz, though some can go higher. After this, the frequency has to be bumped up. To do that, the clock is passed through a PLL. This is where the multiplexer gets involved.

imgf0003.png

The base clock will be multiplexed several times over its lifetime: the first time right after the crystal, and the second... let's say at the CPU. In both cases it goes through the same process. From the image above, let's say that 40a is our original signal, the 37MHz signal produced by the crystal. It is replicated and adjusted by phase shifting, which means each copy is offset in time so that it does not align with the original clock (40b-40d). With these signals now offset, we can create a new clock based on when the offset clocks rise or fall. In this case, each rising and falling edge of the original clocks (40a-40d) creates a rising and falling edge of a new, faster clock (42).
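To make the phase-shifting idea a bit more concrete, here is a rough Python sketch (purely illustrative, my own toy model, not how the silicon is actually designed): it spaces out several offset copies of a base clock, merges all of their rising and falling edges, and shows that the merged edge train toggles N times as fast as the original.

```python
# Toy model of edge-combining frequency multiplication (illustrative only).
# N phase-shifted copies of a base clock are merged; the combined edge
# train toggles N times as often as the original clock does.

def multiplied_edges(f_base_hz, n_phases, cycles=2):
    period = 1.0 / f_base_hz
    step = period / (2 * n_phases)       # spacing between phase-shifted edges
    edges = []
    for phase in range(n_phases):
        offset = phase * step
        for cycle in range(cycles):
            edges.append(offset + cycle * period)                # rising edge
            edges.append(offset + cycle * period + period / 2)   # falling edge
    return sorted(edges)

base = 37e6                              # ~37 MHz crystal, as in the post above
edges = multiplied_edges(base, n_phases=4)
new_period = 2 * (edges[1] - edges[0])   # two consecutive edges = half a period
print(f"base clock: {base / 1e6:.0f} MHz")
print(f"new clock : {1 / new_period / 1e6:.0f} MHz (4 offset copies -> 4x)")
```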

Now let's say that the new signal produced (42) is the reference clock for our CPU (HTT or Bclk). This signal goes through the same process to be sped up even further, but this time we use a very large multiplexer so we can dynamically adjust the speed. Say we take a 200MHz HTT base clock and bring it up to 3000MHz. That means we need a multiplier of 15 to convert 200MHz into 3000MHz, which simply translates to 15 differently offset 200MHz signals. It's that simple. Well, not really: it is still very hard to create and drive such signals, but that's a different topic of discussion.
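The arithmetic on top of that is as plain as it sounds: the final core clock is just the reference clock times the multiplier, which also shows why two different reference/multiplier pairs can land on the exact same frequency (the numbers below are just for illustration):

```python
# core clock = reference clock x multiplier
def core_clock_mhz(ref_mhz, multiplier):
    return ref_mhz * multiplier

print(core_clock_mhz(200, 15))   # 3000 MHz -- the example above
print(core_clock_mhz(200, 25))   # 5000 MHz
print(core_clock_mhz(250, 20))   # 5000 MHz -- same final clock, different path
```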

Difference between overclocking with multiplier and base clock:

The difference between overclocking strictly with the multiplier or with the base clock is zilch, unless we bring in more parameters. Both signals pass through a similar multiplexer, but how each is used is key. A good example is Intel's Bclk. This signal is used by every peripheral on an Intel motherboard, since a northbridge is no longer used. That means everything from USB to Ethernet to DDR uses the base clock to derive a reference signal that works at its own speed. Since USB and DDR run at vastly different speeds, the base clock must be set so that it can be used by both at all times. 100MHz is a very clean, easy-to-use signal that USB and other lower-end peripherals can work with (most likely USB divides the frequency down to 50MHz). Now, if the Bclk is increased by even 1MHz, all the peripherals have to adjust to that speed by either dividing the frequency or increasing throughput. Overclocking USB doesn't really work and was never designed to work, so the Bclk must stay essentially static.

Thus, to increase the speed of the CPU, DDR, and other high-speed modules, huge multiplexers are needed to drive the signal. These multiplexers can only get so large and so efficient before they start to outweigh cost and performance. This is why you see Intel locking their CPUs at specific frequencies, or even maxing out the multiplexer. A 6.0GHz signal would require 60 offset clocks when the Bclk is at 100MHz. That is an insane request, and it's amazing that Intel can even get close to that. AMD has a small advantage here because they would only need half as many signals to create a 6.0GHz signal.
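As a rough back-of-the-envelope check on those numbers (a sketch of the arithmetic only, not a model of any real clock tree): the number of offset clocks needed grows with the target frequency divided by the base clock, while the slow peripherals just divide the same base clock down.

```python
# Back-of-the-envelope arithmetic for the paragraph above (illustrative only).
BCLK_INTEL_MHZ = 100      # typical Intel base clock
HTT_AMD_MHZ = 200         # typical AMD reference clock
TARGET_MHZ = 6000         # the 6.0 GHz example

def offset_clocks_needed(target_mhz, base_mhz):
    """How many phase-offset copies of the base clock the scheme above
    would need to synthesize the target frequency."""
    return target_mhz // base_mhz

print("Intel @ 100 MHz Bclk:", offset_clocks_needed(TARGET_MHZ, BCLK_INTEL_MHZ))  # 60
print("AMD   @ 200 MHz HTT :", offset_clocks_needed(TARGET_MHZ, HTT_AMD_MHZ))     # 30

# The slow peripherals divide the same base clock instead of multiplying it:
print("USB reference:", BCLK_INTEL_MHZ / 2, "MHz")   # 100 MHz / 2 = 50 MHz
```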

If you have any more questions please make sure to reply to this post.
 
OK, that makes sense, thanks. So the efficiency with multi or FSB is the same regardless. Unless, let's say, the initial clock offset is more efficient than the multiplier offset? Cleaner, for lack of better phrasing. This could lend again to every CPU being different and responding differently to raising the HTT.
 
I'm under the impression that for years, Intel chips (and maybe AMD as well, I'm not sure) haven't been true CISC chips, but rather more RISC based with some sort of onboard converter. Is this done purely as a power/money saving move, or is there something more to it? Also, would they gain anything by manufacturing a RISC chip and competing with ARM on that front?
 
OK, that makes sense, thanks. So the efficiency with multi or FSB is the same regardless. Unless, let's say, the initial clock offset is more efficient than the multiplier offset? Cleaner, for lack of better phrasing. This could lend again to every CPU being different and responding differently to raising the HTT.

Not necessarily. That would only apply across very different CPU generations. There were some cases where raising the Bclk on Intel was better depending on what mobo you had, but that was quickly resolved with the next generation; I believe it was the first Bclk generation. In most cases I would trust the clock produced inside the CPU, as it has better control.
 
Here's the one everyone wants an answer to, well maybe not everyone. Why is it that an 8-core AMD FX can't compete with 4-core HT Intel CPUs? I think I understand why in my head, but I'm not sure I could write it out in less than 20 pages of nonsense.
 
Here's the one everyone wants an answer to, well maybe not everyone. Why is it that an 8-core AMD FX can't compete with 4-core HT Intel CPUs? I think I understand why in my head, but I'm not sure I could write it out in less than 20 pages of nonsense.

Excellent question! I'm subscribed for sure.
 
I'm under the impression that for years, Intel chips (and maybe AMD as well, I'm not sure) haven't been true CISC chips, but rather more RISC based with some sort of onboard converter. Is this done purely as a power/money saving move, or is there something more to it? Also, would they gain anything by manufacturing a RISC chip and competing with ARM on that front?

Freak, I'm re-reading up on this today and tomorrow. I want to answer this question after a refresher. It's been a long time since I've read about instruction algorithms in CPUs.
 
Freak, I'm re-reading up on this today and tomorrow. I want to answer this question after a refresher. It's been a long time since I've read about instruction algorithms in CPUs.

Don't stress out about it, I was just curious about it ever since I first heard about it.
 
Here's the one everyone wants an answer to, well maybe not everyone. Why is it that an 8-core AMD FX can't compete with 4-core HT Intel CPUs? I think I understand why in my head, but I'm not sure I could write it out in less than 20 pages of nonsense.

AMD Modules vs Intel Hyper-Threading

There is a long and a short version to this story and to how the two companies ended up at this point. I'll first give an overview and then let you guys ask more questions.


haswell-1.png
Haswell front end algorithm

bulldozer-2.png
bulldozer-3.png
Bulldozer front end algorithm

In the beginning, there was One. The One Core ruled over all. The One Core fetched and executed all instructions, The One Core handled all interrupts, and in the year Twenty O'Five, the reign of The One Core ended. With the introduction of the AMD X2 in the middle months of that good year, a new CPU era had begun. Multi-core processors began to rule the CPU world and brought forth new efforts in software and hardware design. The era of One is over; the era of Many is now.

AMD was the first company to implement multi-core desktop processors in May 2005. Intel developed the first multi-threaded desktop CPU in 2002 with their Xeon and P4 CPUs. Why, and what are these different approaches? And why didn't AMD develop multi-threading as well? I'll first go over multi-threading and multi-core technology, and then why AMD and Intel went different ways.

Around the time CPUs transitioned from 32-bit to 64-bit technology, the software world was ahead of the hardware world. This meant that software was able to fully exploit the hardware it ran on. It was also at this point that single-core CPUs started to reach their maximum potential. Both Intel and AMD were pushing their CPUs as far as the single-threaded world would let them. It was time to grow. The multi-core CPU was a last-ditch plan to keep the x86 processor afloat.

While AMD began to ship dual-core CPUs, Intel started to look into multi-threading. Multi-threading is the use of a single processing core to work on multiple instructions in a cycle. It does not mean there are more cores to compute instructions; rather, it is the same core, masked. Say you have three instructions with different priorities to complete. The CPU will run all three on the same core, but prioritize each. The instructions are switched back and forth through the processing unit so they are computed at around the same time, yet the highest-priority instruction finishes first. To properly implement multi-threading, the buffers of your compute nodes need to be very large. There is also a high demand for accuracy in your fetch and decode stages. These modules need to accurately calculate which instructions have to be processed before others, and which can be offloaded to multi-threaded branches.
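A toy way to picture that (purely illustrative, nothing like the real scheduling hardware): two instruction streams share one execution unit, the higher-priority stream gets more issue slots, and both make progress "at the same time" while the high-priority one still finishes first.

```python
# Toy model of two threads sharing one execution unit (illustrative only).
from collections import deque

def run_smt(high, low, high_slots=2, low_slots=1):
    """high/low: number of instructions left in each stream."""
    remaining = {"high": high, "low": low}
    pattern = deque(["high"] * high_slots + ["low"] * low_slots)
    timeline = []                        # which stream issued on each cycle
    while remaining["high"] > 0 or remaining["low"] > 0:
        name = pattern[0]
        pattern.rotate(-1)
        if remaining[name] == 0:         # that stream is done, give the slot away
            name = "high" if remaining["high"] > 0 else "low"
        remaining[name] -= 1
        timeline.append(name)
    return timeline

print(run_smt(high=4, low=4))
# ['high', 'high', 'low', 'high', 'high', 'low', 'low', 'low']
# one core, two streams interleaved, the high-priority stream finishes first
```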

Intel's Hyper-Threading is a very polished implementation of multi-threading. Its front end (latest generation pictured above) is one of the reasons why Intel is doing so well with this technology. The branch prediction and decode ability of the front end can prioritize instructions very well, so it can offload work to a hyper-threaded branch easily and efficiently.

If Hyper-Threading is the software side of multiprocessing, multi-core is the hardware side. Multi-core CPUs exploit multi-threaded computation to distribute load and increase efficiency. Let me repeat that clearly: multi-core CPUs are useless unless the software allows its work to be distributed amongst many cores. This doesn't mean that a single-threaded program will only ever target the first core in a multi-core CPU; the CPU will move a single-threaded program amongst its cores depending on its architecture. But a multi-core CPU only works at peak efficiency when the software exploits the CPU architecture. That means the software must recognize the number of cores and know how to split its work up amongst them.
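To illustrate that last point in software terms, here is a minimal (hypothetical) Python example using the standard multiprocessing module: the extra cores only get used because the program explicitly splits its own work into chunks.

```python
# The extra cores only help because the program splits its work up for them.
from multiprocessing import Pool, cpu_count

def work(chunk):
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n = cpu_count()                              # how many cores we can target
    chunks = [data[i::n] for i in range(n)]      # split the work explicitly

    serial = work(data)                          # single-threaded: one core busy
    with Pool(processes=n) as pool:
        parallel = sum(pool.map(work, chunks))   # spread across all the cores

    assert serial == parallel
    print(f"{n} cores, same answer: {parallel}")
```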

This is where AMD's Bulldozer comes into play: from a hardware perspective, how do we distribute instructions efficiently across cores? The Bulldozer module is a novel approach to distributing instructions inside an execution core. Each module has two integer cores that share a front end and a back end, plus a single FP unit. Since there are two integer cores, a Bulldozer module is counted as two cores.
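Here is that sharing arrangement sketched as a little data structure (the labels and numbers are my own shorthand, not AMD documentation): the OS counts two integer cores per module, while the front end, the FP unit, and the L2 are shared.

```python
# Rough sketch of Bulldozer's module layout (labels are my shorthand).
from dataclasses import dataclass, field

@dataclass
class BulldozerModule:
    shared: tuple = ("front end (fetch/decode)", "FP/SIMD unit", "L2 cache")
    per_core: tuple = ("integer scheduler", "integer pipes", "L1 data cache")
    integer_cores: int = 2          # what the OS counts as cores

@dataclass
class FXChip:
    modules: int = 4                # e.g. an FX-8xxx part: 4 modules
    module: BulldozerModule = field(default_factory=BulldozerModule)

    @property
    def advertised_cores(self):
        return self.modules * self.module.integer_cores

    @property
    def fp_units(self):
        return self.modules         # one shared FP unit per module

fx = FXChip()
print(f"advertised cores: {fx.advertised_cores}, FP units: {fx.fp_units}")
# advertised cores: 8, FP units: 4
```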

Now, why did AMD and Intel go separate ways? Intel has already been covered: they are stronger on the algorithm side of the CPU and have the most advanced x86 front ends. Their CPUs can artificially identify the priority of an instruction and distribute it amongst cores and threads, a very brutal combination when pitted against AMD CPUs. AMD, however, has not advanced as far as Intel on the algorithm side. Their front end (pictured above) is only about as good as an Intel Pentium M's. AMD has gone the way of the Pentium 4 and pushed the speed of single-threaded processing to its limits. This doesn't mean that AMD got it wrong or that they are completely out.

Brass tacks: what is really the defining difference that separates AMD and Intel? It is their different styles of handling multi-threaded instructions, and that is not down to Intel's Hyper-Threading or AMD's Bulldozer cores. It's the front end of the CPU that makes the difference. Intel has invested more money than any other company to create a very intelligent algorithm for ranking the priority of fetched instructions. AMD's prediction algorithm is roughly at the level of the Pentium M era. Since their style is a race against space, the CPU must use brawn rather than brains to finish an instruction.
 
How does transistor count play a role? We saw a massive cut in transistors with AMD when moving to 32nm, although there are some AMD 32nm chips based off Phenom cores. I've noticed similar processing power but at much different clock speeds, say Llano 8350 vs FX-4300: the Llano at 3.6GHz benches similar to an FX core at 4GHz. Can we get some input on how transistors are used in the CPU for memory, NB, HT and such?
 
Thanks for doing this Dolk, very informative.

A side question -- Is there anything that AMD does better in its CPUs than Intel, besides using more power? I guess an argument can be made for APUs and their graphics capability, but Intel's new iGPU is a strong contender there as well. I realize that financially it is a David vs Goliath battle, but I figure AMD must have something they do better than the Intel counterpart. Say, for example, an 8-core AMD vs an 8-core Intel (Xeon/5960X with HT turned off to make things equal at core/thread counts).
 
Thanks for doing this Dolk, very informative.

A side question -- Is there anything that AMD does better in its CPUs than Intel, besides using more power? I guess an argument can be made for APUs and their graphics capability, but Intel's new iGPU is a strong contender there as well. I realize that financially it is a David vs Goliath battle, but I figure AMD must have something they do better than the Intel counterpart. Say, for example, an 8-core AMD vs an 8-core Intel (Xeon/5960X with HT turned off to make things equal at core/thread counts).

I can answer this one rather quickly before I move on to the other two questions I'm still trying to finish up.

As of right now, AMD is pretty far behind Intel. Back in the pre-Sandy Bridge era, when AMD released Bulldozer, AMD had a decent hold on the multi-core/multi-threaded application world. In a lot of cases (and this is still pretty true) high-computation data centers would choose AMD over Intel: since AMD offers a higher core count, more data can be processed. If your program is core-count sensitive, AMD had a huge win over Intel. That was a couple of years ago, though. Since AMD hasn't refreshed their core, Intel has started to make headway on the server side once more. Server companies have not completely dropped AMD; it's mostly that AMD has not refreshed their product, and companies want the latest when they buy a new server.

Surprisingly enough, if you look at the history of the Top500 you will find that AMD ran the top supercomputers for several years. PowerPC and Xeon are now making headway since AMD hasn't done much with Opteron.
 
Surprisingly enough, if you look at the history of the Top500 you will find that AMD ran the top supercomputers for several years. PowerPC and Xeon are now making headway since AMD hasn't done much with Opteron.

AMD was running in supercomputers not because its performance per core or total computational performance per CPU was higher, but because it was possible to put more cores (CPU sockets) on a single board. Simply having more cores in the same space gave the advantage in calculations, not to mention a much lower cost per core. Another thing is that earlier AMD parts had lower latency on data transfers between cores, so they were actually good at running many small operations at the same time.

Before the Core 2 generation, Intel was the better choice because it was more reliable. After it, Intel was the better choice because it offered higher performance in most operations. By reliability I mean the quality of the motherboards, the availability of spare parts, and so on, which was always worse on the AMD side.

In the time I've worked in IT (12 years or so) I have never seen AMD hold more than ~15% market share in EU business products. AMD has always been associated with home entertainment, and people still see it that way.
Right now I don't think anyone would choose AMD over Intel for work. AMD's business department is practically dead, and it's the same in servers and laptops.
Personally, I don't know a single distributor in central EU offering AMD-based servers from stock, and I have worked in distribution for a couple of years. In Poland I haven't seen an AMD-based server for 6-7 years. People simply don't trust AMD enough to buy it for business, not to mention the lower performance compared to Intel.
The biggest server manufacturers don't have anything based on AMD in mass production. A couple of years ago IBM kept one, literally one, AMD server series because of a few specific operations that ran faster on AMD, and that was back when the first Phenoms were on the market.
 
I run CFD programs. Why can't any of my software use the hyper-threads on my Intel CPUs, but it can use all 8 cores of an FX?
How the hell do they fit over a billion transistors on a chip? That's a billion of something I can hold in my hand!!!!
Folding@home uses our GPUs to compute stuff. I just bought some very high-dollar software that will be running on Quadro 6000Ms; will this type of software be coming to the home PC, or is the cost just too much?
Holy mother of god, I could fly with Creflo Dollar for the cost of this stuff over the next ten years!!!!
 
Ah, NVM, I just did some quick Googling and found some interesting facts about my question.

I thought this was a good read.
http://www.xbitlabs.com/news/cpu/di...x_AMD_Engineer_Explains_Bulldozer_Fiasco.html

Found it funny that AMD found 800 million fewer transistors per CPU.

That would explain why my 32nm Llano can produce a bigger bang per transistor count. I'm talking about the Phenom II quad still being a better CPU than the FX quad. The only thing the FX 8-core offers is actually having 4 extra cores.

Step forward.... Or step backward?

FX module design new.... or based off old non working designs...??

Meh, doesn't matter. Per core Intel has more transistors, utilizes instructions better and has better IMC including floating point.....

I really don't see why we would compare a turnip to a walnut.
 