
Why increase voltage? What is the proverbial OC wall? Read for some answers!


icesaber

Member
Joined
May 22, 2004
Location
Michigan
A lot of people have the abstract idea that a CPU is like an electric motor, and that to make it go faster you have to up the voltage. I want to clear up that myth for those who don't already know. I'd also like to touch on why CPUs generally have a maximum point they will OC to before they just won't go any further. Also, anybody who can put it more eloquently should feel free to correct me.

The truth of the matter is that the amount of voltage needed for a CPU to operate is based on tolerances for binary values. In a CPU, or any transistor for that matter, there isn't really a 0 or 1 value; there are only low and high voltages. The tolerances are what I call the minimum point at which the transistor recognizes the voltage as a 1 (I will refer to this as "low tolerance"), as well as the maximum point where a 0 is recognized (high tolerance). It can be helpful to look at this pair of tolerances as a band, since any value that falls between the two may produce an unknown output (varies by chip). As a side note, some chips use "low assertion," which simply means that a low voltage is read as a 1 instead of vice versa.
When you overclock a CPU, less time is available per cycle for voltage to build up at the transistors in the operational units. There may no longer be enough time for the signal to propagate across those microscopic wires between transistors and for the necessary voltage to accumulate. We could then get erratic responses from the CPU, as some of the 1 bits become unknown values within the band, or even 0s if the band is narrow enough to allow it, since they now fall below the high tolerance.
If we increase the voltage, we increase the current, thereby decreasing the amount of time required for that voltage to build up at the transistor and produce a 1.
Realistically, the tolerances can be the difference between a high-end and a low-end chip. Let's say you have a 3.0GHz P4 and a 3.4GHz one. If they have the same core, it is very possible that the only differences are tolerances. The transistors in the 3.4 may have a reduced low tolerance, so that a lower voltage level is enough to register a 1 and the correct output is produced at the higher clock without raising the supply voltage. So in reality, a faster chip does not necessarily have a "higher quality" core, but just a downward-shifted tolerance band.
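The point about raising voltage to shorten the build-up time can be sketched with a toy model. This is only an illustration, not how any real CPU is characterized: it assumes a single wire that behaves like a simple resistor-capacitor (RC) node charging from 0 V toward the supply voltage, and a fixed "low tolerance" it must cross to read as a 1. The function names and the resistance, capacitance, and voltage figures are all made up for the example.

```python
import math

def time_to_reach(v_threshold, v_supply, r_ohms, c_farads):
    """Time for an RC node charging from 0 V toward v_supply to cross v_threshold.

    Toy model: v(t) = v_supply * (1 - exp(-t / (R * C))), solved for t.
    """
    if v_supply <= v_threshold:
        return math.inf  # the node can never climb to the threshold
    return r_ohms * c_farads * math.log(v_supply / (v_supply - v_threshold))

def max_clock_hz(v_threshold, v_supply, r_ohms, c_farads):
    """Highest clock at which the node still reaches the threshold within one cycle."""
    t = time_to_reach(v_threshold, v_supply, r_ohms, c_farads)
    return 0.0 if math.isinf(t) else 1.0 / t

# Hypothetical numbers purely for illustration.
R, C, V_ONE = 1_000.0, 1e-12, 1.0
for vcore in (1.50, 1.65, 1.75):
    print(f"vcore {vcore:.2f} V -> max clock {max_clock_hz(V_ONE, vcore, R, C) / 1e6:.0f} MHz")
```

In this model a higher supply voltage crosses the same threshold sooner, which is the effect described above; a real chip has many more factors (transistor drive strength, temperature, leakage) that this sketch ignores.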

The reason we can only get so much performance out of the chip is the timing of the pipelining process. Just like any assembly line, the pipeline in a CPU can only advance as fast as its slowest stage. For example, if the pipeline stages have the times 8-2-2-3-2-10, we would have to allow every stage a time of 10 or more, so that every instruction in the pipeline has time to complete the slowest stage. Most chips are actually set up to have a safety region, a kind of overhead. In the case of the above example, we may actually make the time 15 or even 20 just to make sure everything completes correctly (this also varies by the speed of the CPU; see the 3.0-3.4 comparison, same core).
If we overclock it too far, we may actually make the time allowed less than that minimum, like 9. If the amount of time gets too low, instructions may be unable to complete one or more stages of the pipeline, producing erratic and almost always unbootable results. This is one of the proverbial walls to overclocking a CPU. The other primary walls are heat and electron migration.
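The 8-2-2-3-2-10 example above can be worked through in a few lines. This is just a sketch of the arithmetic: the stage times come from the example, the units are arbitrary, and the function names are invented for illustration.

```python
# Stage times from the example above, in arbitrary time units.
STAGE_TIMES = [8, 2, 2, 3, 2, 10]

def min_cycle_time(stage_times):
    """The shortest usable cycle time is set by the slowest stage."""
    return max(stage_times)

def failing_stages(stage_times, cycle_time):
    """Indexes of stages that cannot finish within the given cycle time."""
    return [i for i, t in enumerate(stage_times) if t > cycle_time]

def headroom(stage_times, rated_cycle_time):
    """Fractional clock increase available before the slowest stage fails."""
    return rated_cycle_time / min_cycle_time(stage_times) - 1.0

print(min_cycle_time(STAGE_TIMES))       # slowest stage sets the limit
print(headroom(STAGE_TIMES, 15))         # margin when shipped with a cycle time of 15
print(failing_stages(STAGE_TIMES, 9))    # overclocked too far: which stages break
```

With a shipped cycle time of 15, the clock could in principle be raised by half before the slowest stage runs out of time; at a cycle time of 9, the last stage no longer completes.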
Heat is an obvious one, because the hotter the chip runs, the more voltage is required to perform the same task (reduced electrical efficiency). Realistically, we may get better performance out of a chip at a slightly slower speed. For example, my Athlon XP mobile Barton 2600+ may put out better synthetic benchmark results at 2.6GHz, but it boots faster, generally plays games more smoothly and moves around the OS better at 2.5GHz, solely based on the 5 degrees Celsius difference.
Electron migration (more properly called electromigration) is the point at which a chip will most likely die. It happens when so much current is pushed through a wire that the moving electrons gradually drag metal atoms out of place, thinning the wire in some spots and piling metal up in others until it opens or shorts to a neighbor, producing erratic results. You can compare it to a river during a heavy rainstorm: once the water level overruns the banks, it may erode fresh streams and offshoots from the original river, which keep flowing after the storm has passed. If this happens to the CPU, the chip is almost guaranteed to be finished. You may as well make it a new hood ornament. Since heat is directly related to molecular motion, it's plain to see that higher operating temperatures can easily increase the risk of electron migration. The simple solution to this is better cooling. Bear in mind that some electron migration occurs no matter what, because of temperature and the nature of electrons, but that's why the tolerances exist: a certain amount can occur without producing unexpected results. There is no sure way to tell when this has killed your chip; it is just one possible way that a CPU can burn out (although it's quite common). So if your CPU suddenly burns out, this may be the culprit.

There's obviously a lot more to it than this, but questions and comments are still welcome as always.

*edited 2:09PM Friday 31 Dec 2004*
- inconsistencies: added "tolerance band" abstraction, tx to jbloudg20
*edited 6:23PM Saturday 01 Jan 2005*
- inconsistencies: details to electron migration and low assertion, tx to Captain Newbie
 
Nice read, icesaber, as always :)
I suppose this would pertain to GPUs as well? It would explain the performance drop from raising the core too high sometimes. Although a lack of voltage could be a reason there too.
 
As far as the threshold goes, there isn't a clear "this is high, this is low" voltage point. Usually there is a middle region where there is no defined value. For a typical 5 V CMOS chip, between 0 and 1.5 volts is a low, and from 3.5 to 5 volts is a high. In this middle "gray area" the state is uncertain. A given manufacturer cannot guarantee how their chip will react in this condition.
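To make the gray area concrete, here is a minimal sketch that sorts a measured voltage into low, high, or undefined using the figures quoted above (0 to 1.5 V low, 3.5 to 5 V high). The function name and the `active_low` flag (standing in for the "low assertion" chips mentioned in the thread) are assumptions for illustration; real parts publish their own thresholds in the datasheet.

```python
V_LOW_MAX = 1.5   # at or below this, the input reads as a low
V_HIGH_MIN = 3.5  # at or above this, the input reads as a high

def read_logic_level(volts, active_low=False):
    """Return 0 or 1 for a valid level, or None inside the undefined band."""
    if volts <= V_LOW_MAX:
        bit = 0
    elif volts >= V_HIGH_MIN:
        bit = 1
    else:
        return None  # gray area: behavior is not guaranteed
    # On a low-assertion input, a low voltage means 1 and a high voltage means 0.
    return 1 - bit if active_low else bit

for v in (0.4, 2.5, 4.8):
    print(v, read_logic_level(v), read_logic_level(v, active_low=True))
```

A signal that has not finished rising when it is sampled lands in the `None` case, which is where the erratic behavior from pushing the clock too far comes from.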
 
jbloudg20 said:
As far as the threshold goes, there isn't a clear "this is high, this is low" voltage point. Usually there is a middle region where there is no defined value. For a typical 5 V CMOS chip, between 0 and 1.5 volts is a low, and from 3.5 to 5 volts is a high. In this middle "gray area" the state is uncertain. A given manufacturer cannot guarantee how their chip will react in this condition.

Tx for the input, jbloudg20, much appreciated. I've updated the post accordingly; let me know if that seems more accurate. Accuracy is my goal here.
And tx to everyone else for the support. If this does get it, it'll have been my first sticky.
I realize I don't post much, although I've been a member for some time... I just thought it was time to at least try to make a contribution.
 
You have essentially boiled down Hitechjb1's hundreds if not thousands of posts into a nice summary.

Electron migration is one of several killers. Carelessness in overclocking would contribute to more processor deaths IMO since people take processors in BIG steps with improper cooling. ( = dead silicon) You may say it's the same thing; really, it is as the electron migration is a result of the huge steps.

I would be inclined to add that electron migration rates and temperatures are directly related, since temperature is a measure of the average kinetic energy of the substance's particles, which in turn sets their root mean square velocity. In essence, they're really the same thing with the same results. This (as you have explained) explains why most air coolers won't go as far as liquid or phase-change coolers. Stuff bounces around at a slightly slower velocity and as such is less inclined to hop off the wire it's supposed to be on.

It should be noted that electromigration happens at all voltages and all temperatures greater than zero kelvin (absolute zero) because of quantum uncertainty. However, 90%-99% of the time, everything is where it's supposed to be, so we can throw the quantum argument out.

Some architectures actually use a high voltage as a zero, and a low voltage as a one (low-assertion), but the end result is the same. Increasing voltage gives more "potential" as it were to charge the transistors--essentially as you put it. Just being picky.

From this we can conclude that there are actually two maximum frequency/voltage points: one that varies with temperature/electromigration (V sub rms), and another that is purely architectural. Regrettably, I don't think any of us have the resources (i.e., processor blueprints and an electrical engineering degree) to determine what either or both points are except through trial and error.

Coolness. Edit, reformat, and sticky it in a prominent place.
 
This clears up so much for me. This thread NEEDS to be a sticky, since I know that most people on the forums don't know why they have to increase the voltage on an o/c. Thanks for a very informative post, icesaber, and props to Captain Newbie for his response as well.
 
In general, I believe that the scientist-overclocker is a dying breed, being replaced ever so slowly but surely by the enthusiast overclocker, who doesn't really care about such things as long as it is OMFG! my PC r0x0rs teh big one!!1!11!11!!!!1one. Overclocking is really all science...
 
Great post!
So overclocking and running a CPU hot can hurt real-world performance? That's something I didn't know.
 
I don't have all the technical info behind it, but in my experience, overclocking too much while the heat is too high can actually reduce performance. I know it definitely has something to do with reduced efficiency of the logic circuits at higher temperatures, most likely the cache memory since the cache is the biggest piece of latency in the pipeline architecture (to my knowledge) and would therefore produce the most noticeable difference... hence better performance from extreme cooling. I'm running air-cooled, so I'm quite sure that this chip can handle 2.6GHz and beyond beautifully if it were watercooled. No cash for that for now.

Actually, my OC is stable as long as the vcore doesn't go below 1.7v, but my NF7 seems to droop... 1.75 is the BIOS setting I use, while the monitor actually reads 1.72. Not sure if I should trust that reading; it seems dubious.
 
How cold can you get your room? Maybe try a night of benchies to see if a cold night can help you find out if the cold might negate the performance drop?
 
With the ambient case temp at 19 degrees Celsius, I still get the performance hit. That's all the windows open on a snowy sub-freezing day.
 
Hey, this might explain why a couple of times I've got better bench results (over 50 points or so) from a lowered voltage at the same speed.
 
It's an excellent, informative article.

I ran into a lot of the same issues when I was overclocking. Now I'm stable, and fast, and running fancy free :D
 