RISC vs CISC. Most processors run on reduced instruction sets today. Our beloved x86 is one of those that use a complex-to-reduced ISA (instruction set architecture) pre-converter. I believe from history that Cyrix started this back with its 486 processors, and later NexGen, AMD and Intel followed with the process. This involves taking an instruction that performs multiple tasks (read data, add an immediate value, store to memory or a register, then bump the address pointer) and converting it into several simpler, smaller operations that each do one step separately. But wouldn't that take longer, you might ask? Actually no, due to the pipelining now used: as SuperNade pointed out, and as the articles explain, multiple operations are in flight at the same time. In effect, long before the add takes place, the data has already been read in while the previous instruction(s) executed. Once the add takes place, the next instruction in the pipeline can start while the data gets stored back and the address bumped. Exceptions can occur when the same address or register is hit by different operations; the pipeline then has to adjust its ordering (stall or reorder) to prevent corruption. One thing that helps is compilers that perform code optimization. Optimized code will spot the same memory being hit repeatedly and convert the operation to use registers instead. Once the value is loaded into a register, the two instructions will load into the pipeline for back-to-back execution, resulting in one read for the ops (operations) and one write and bump.
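To picture what that pre-converter does, here's a little sketch of the idea: one complex "read, add, store, bump the pointer" instruction gets expanded into simple micro-ops. All the names and encodings here are made up for illustration; real x86 decoders are far more involved.

```python
def decode_cisc(instr):
    """Expand one CISC-style instruction into RISC-like micro-ops.

    Hypothetical encoding: a dict with an "op" name plus operands.
    """
    if instr["op"] == "ADD_MEM_POSTINC":
        ptr, imm = instr["ptr"], instr["imm"]
        return [
            ("LOAD",  "tmp", ptr),   # read data from memory
            ("ADDI",  "tmp", imm),   # add the immediate value
            ("STORE", "tmp", ptr),   # store the result back
            ("ADDI",  ptr,   1),     # bump the address pointer
        ]
    # simple instructions pass through as a single micro-op
    return [(instr["op"],) + tuple(v for k, v in instr.items() if k != "op")]

uops = decode_cisc({"op": "ADD_MEM_POSTINC", "ptr": "r1", "imm": 5})
for u in uops:
    print(u)
```

Each of those four micro-ops can then flow down the pipeline independently, which is exactly what lets the load happen long before the add is due.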
Also note the two L1 caches on today's processors. One L1 is for code, the other L1 is for data. This allows data to be streamed in and code to be handled separately. With the exception of branches, jumps and calls, code executes linearly, whereas data mostly comes in blocks or single data items. The bottleneck occurs in the L2 when the cache controller has to deal with pulling data from different parts of memory, or when the code has branched. Instruction prefetch helps by signalling the request before the actual instruction hits the execute stage. By design the processor "can" work on floating-point ops, preprocess code and execute other code while the data is brought in.
Not to discredit ARST, but please note that while Ars Technica looks official, it's more opinionated than factual. Most of the data is correct, but the authors appear to be political and stray from the subject. An example is OC vs EM, or electromigration. I will say that just overclocking a CPU, and even some overvolting, will not "generally" (cautious word) hurt a properly cooled CPU. What the article writer seems to miss is that EM may occur at certain voltage levels, when electrons start hopping over gates. While I have not seen this theory proven, it's been understood as a possible explanation for SNDS and slow CPU death. A proven anomaly is tunneling, where electrons burn through the structure of a transistor, turning it on or off permanently. This can happen at low temperatures since it is only a small area that is affected. While subzero cooling can reduce the occurrence, it may still happen in an isolated area where the heat is just one tiny speck on the die and simply cannot dissipate fast enough. Why cooling helps is down to physics and the lowering of the electron fields of the elements: traveling electrons can jump from the field of one atom to the field of the next at a faster rate. Imagine a raging herd of bulls running through the streets. Cooling would be like shrinking the buildings and widening the streets so more bulls can get through. "Otherwise, goodbye shop windows and doors." Like the flow of bulls, it's about volume or flow rate, which is the amperage, and pressure, which is the voltage. Higher voltage equals more pressure, which forces more amperage. Thus you are trying to shove more electrons through, which allows the gates to respond faster, but at the price of possibly busting the gate.
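The pressure-and-flow point can be put in numbers with the standard CMOS dynamic power relation, P ≈ C × V² × f: power (and thus heat to dissipate) grows with the square of voltage. The capacitance, voltage and frequency below are made-up illustrative figures, not any real chip.

```python
def dynamic_power(c_eff, volts, freq_hz):
    """Approximate CMOS switching power in watts: P = C * V^2 * f."""
    return c_eff * volts ** 2 * freq_hz

# assumed numbers for illustration only
stock = dynamic_power(1e-9, 1.50, 1.0e9)   # "stock": 1.50 V @ 1.0 GHz
oced  = dynamic_power(1e-9, 1.65, 1.2e9)   # overvolted +10%, overclocked +20%

print(f"stock: {stock:.2f} W, overclocked: {oced:.2f} W")
print(f"power increase: {oced / stock:.2f}x")
```

A 10% voltage bump plus a 20% clock bump gives roughly 45% more switching power in this model, which is why the cooling has to keep up when you shove those extra electrons through.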