PSA for 13th/14th gen Intel CPU owners

I've had a 12900K, 13900KS, 14900K, and 14900KS. The only one that was not affected by any of the stability/degradation issues was the 12900K. From what I've gathered over the last couple of years, the problem is the single-core boosting and the vcore being used to reach those boost clocks. The microcode/BIOS updates do not fix the issues; they're simply a band-aid until Intel can push the next generation of CPUs out. To protect your Intel 13th/14th gen CPU you need to first stop it from boosting, and second tune the vcore, i.e. don't let the BIOS shove whatever it wants at it. So basically, lock the cores down to prevent boosting and then tune your vcore manually. Unfortunately my 14900KS degraded so fast, it got RMA'd within 3 months of purchase.
 
Like Woomak alluded to, I'm of the thinking that this is much ado about not much.

My main rig, 13700k/MSI z790 Tomahawk Wifi D4 has been running fine and dandy for 10-12 months (need to update my profile). No idea what bios is on the MB currently as the rig is in parts waiting for new mb, psu and ddr5 memory (got bored). I generally don't bother with benchmarks unless I'm having an issue so nothing to compare to. Sorry, I find benchmarks boring compared to a good game. Rather, I use games as stability testers...yeah, yeah, I'm lazy. Wanna fight aboudit? Hehe. Last bios update was 5 to 6 months ago. So likely badly outdated. I'll obviously update the new mb to the latest when I get it installed but I'm genuinely not concerned. Still have 4 more years to go on the updated warranty after all. RMAs are old hat if it comes to that.

Will try my best to update with more info when the new gear is installed...if my failing memory doesn't leave me hangin. Altho, I doubt my voltages etc are going to help anyone clarify much.
 
I wish that were true, but unfortunately the numbers we have available say that it is a significant problem for Intel. Glad your CPU doesn’t seem to be affected.
 
Last week I updated my AORUS Z790 to F12d BIOS.
Performance seems good. Passed many stress tests for CPU/memory (P95/Cinebench/Karhu). 14900k scores 40k+ on Cinebench. I keep HWInfo running in the background to monitor WHEA errors and it didn't report any.
On the other hand, I experienced two hard shutdowns, which I haven't seen on previous bios. By hard shutdown I don't mean BSOD - I mean the PC goes fully down and I hear the PSU click as if it's turned off and then right back on. It happened while idle.
Not sure what to make of this.
 
The numbers we have available are precisely the issue. No one knows what's true or what's FUD. What's for certain is the bigger the number the higher the clicks (well someone knows but that someone will never spill). Are they failing at the whopping 10-12% some sources have claimed? Or maybe it's the sensational 47-50% failure rate claimed by other verrry reliable sources? Lol. Regardless, I don't fancy being led around by the nose by a group of -cough- journalists whose livelihood is strictly based upon the almighty click. Obviously there's a sizable problem, no one is denying that. But having no actual idea as to what the real scope is, other than what's been "reported" so far, doesn't sit well with me. Anyhow, in reality, it's likely a much smaller percentage than anything we've been spoon-fed IMO. The failure rate is probably much closer to the average RMA rate but what do I know? I've only built a handful of the affected vs the bazillions these reliable sources claim to deploy...

BTW, it's been written in stone now that ST boosting resulting in crazy 1.6 V+ overvolting is what's degrading/killing the CPUs, correct?
 
Wendell's numbers from his test on a large number (by customer standards) of systems were eye-opening, and he is probably the least clickbait tech YouTuber there is. Statistically, it is still a small sample size that I don't expect actually holds across the broad market, but given his numbers and the numbers quoted to him from large OEMs, it is clear that this is a significant issue, unless you believe all the people with industry contacts are lying. Additionally, we have multiple reports, and even email captures, of customers trying to RMA their CPUs and being told that there is no stock available for replacements of either the 13900K or 14900K; that is not a normal RMA situation. Of course, only Intel knows the actual numbers and they will never tell, so unless there are internal communications that are exposed as part of the incoming lawsuits, we'll never know the full scope of the issue, but all evidence points to it being a significant issue, much more than normal RMA failure rates.

As far as what's causing the issue, Intel has still not stated what the issue actually is. We do know that they had a manufacturing flaw in their RPL production lines that was causing oxidation and leading to rapid degradation and failure, but we don't know how many CPUs were produced with this flaw outside of it starting towards the end of 2022 and not being fully fixed for about 18 months. Outside of that, there seems to be an additional issue of Intel pushing the voltage higher than the CPUs can sustain without degrading much faster than they should. The only confirmation we have on this is that Intel's latest microcode to address the issue limits the max voltage to 1.55 V. We still don't have confirmation, though, that going above this voltage was the root cause of failure or that staying below this voltage is sufficient to stop the CPUs from degrading quicker than they should.
 
I had another hard shutdown. Happened on very light load. Flashing back to F10 bios.
 
I think the prevailing theory is that because voltage is still able to spike for short boosts on a single core, it's degrading the ring bus. So while the new BIOS has an overall lower average vcore, the transients are still high, so people don't see a loss of performance.

It really feels like Intel is trying to make the problem small enough that they can cover RMAs, but not really taking responsibility, so they don't get crushed by lawsuits.
 
Completely agree re Intel and massive recalls/lawsuits. Manufacturers often know the flaws before rollout across many industries.
 
Of course there are going to be stock shortages. Everyone and their brother are panic claiming RMAs right now lol. There are literally hundreds of threads out there of people asking if they should RMA their proc because they experienced a crash ONCE playing a game. One crash...insta panic. And every um, expert, advises them to return it rather than doing even basic troubleshooting. What's to be expected? No company expects a huge glut like this or is prepared for it, and they certainly don't want to be looked upon as the big bad denier of false RMA claims. Rather they simply approve them all and make the masses happy. A year's worth of returns in 2 months...yep they have stock for that lol. After this panic season what will their return rates look like? Probably next to nothing. There won't be anyone left! Hahaha.

Again, reality and what's been "reported" veer way off from what I'm seeing. People are waiting a few days, three at most, for RMA approvals. Many times 13th gen are being upgraded to 14th, unasked. It's certainly not the OMGarsh Intel denied my RMA bs that was spread across infinity for months. I have seen one person being denied an RMA recently and that was months ago. Since then. Not one. Intel is covering their rears both via the RMA process and the media. They issued a press release a few days ago btw, I'm surprised it hasn't been posted here? They aren't stupid enough to create more bad press for themselves by denying or making returns difficult. But that's just what little ol me is seeing online.
 
For sure there is a bit of a fervor around the issue that is probably driving more than necessary returns, but there is plenty of evidence and even admissions from Intel that this goes well beyond normal failure rates, so to suggest that everything's fine, not a big deal, seems like willful ignorance to me.

Not sure what press release you are referring to; Intel puts those out quite frequently.
 
well beyond normal failure rates,
I wonder what that actually is. Are we talking double? From say 5% to 10%? What's the historical failure rate after 1 year (2? 3? 4? 5?)...

EDIT: Here's a snip from Puget systems...

1725999451840.png
 
Only the manufacturers and big OEM/ODM partners will know those numbers for sure, but for high performance consumer ICs, the general consensus seems to be < 5%, with the vast majority of failures happening at the beginning of life (i.e., DOA or very early in usage). Outside of that, you typically don't find many failures until the products reach their end of life stage and failures begin to increase rapidly, but that's obviously to be expected and should only happen outside of warranty. With RPL, that aging process was drastically accelerated and CPUs are reaching end of life stage way too soon. How many RPL units are experiencing this is something only Intel knows, and for sure they won't be telling anyone that number, but it's enough to cause a lot of headaches for them when they least need them.
 
Curiously, from Puget's dataset, we aren't seeing a ton of failures there yet. Obviously 14th-gen is beat up (vast majority of failures at the beginning)... though similar to their 2021 numbers on 11th gen.

From another article...

1725999807945.png


Clearly these can't be taken as The Gospel, but they do give a good snapshot in time. What happens in the future is the question... but, hell, unless they love pointing large caliber weapons at their own foot, I can't imagine that value to be too high (I have no idea what too high actually means...).

But if the average failure rate is 2% and this is 10x worse, 4/5 are still good. Clearly that's terrible in context, but, I'm a gambler and get tickled with ~49% odds of success in Blackjack, LOL.
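To spell out the arithmetic in that last line, here's a quick sketch. The 2% baseline and the 10x multiplier are the hypothetical figures from the post, not measured data, and the function name is made up for illustration.

```python
from fractions import Fraction


def share_still_working(baseline_failure_rate: Fraction, multiplier: int) -> Fraction:
    """Return the fraction of units that do NOT fail if the failure rate
    is `multiplier` times a given baseline (both hypothetical inputs)."""
    failure_rate = baseline_failure_rate * multiplier
    if not 0 <= failure_rate <= 1:
        raise ValueError("scaled failure rate must stay between 0 and 1")
    return 1 - failure_rate


# Hypothetical figures from the post: 2% baseline, 10x worse.
baseline = Fraction(2, 100)
print(share_still_working(baseline, 10))  # prints 4/5
```

So a 10x jump on a 2% baseline is a 20% failure rate, which is where the "4/5 are still good" comes from.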
 
so to suggest that everything's fine, not a big deal, seems like willful ignorance to me.

Not sure what press release you are referring to, Intel puts those out quite frequently.
Obviously you haven't been paying attention to my posts. Like I said in my initial post, there's clearly a big problem. I'm simply being real rather than following our trusty click driven YouTuber "reporting". If you can't handle alternate points of view, meh you do you. Personally, I find that closed-minded take a bit lacking in the logic department. We have no facts, yet we blindly believe these people? Why? This click driven "reporting" (being what it is), has blown things up into this reddit type of maelstrom. I have a big issue with that. Let's dig a little on our own. Like my last post alluded to, there's more here than what we're being spoon-fed by YouTube.

The release was about the Vmin Shift Instability issue and the latest bios updates. I don't have time to find it atm.
 
Yeah, unfortunately we don't know Puget's testing methods or what they consider a failure. For instance, many users never knew they had degraded CPUs until trying to run specific games. Why those specific games? Because only recently some games started actually checking the output of the shader compilation / decompression work and throwing an error if it was incorrect. If you buy a Puget system for photo and video editing, you may have a degraded CPU but may not experience an issue, or you have issues in the background you just aren't noticing yet, or the degradation is happening but it isn't affecting your workflow yet, or your CPU may be fine. The whole range is possible. It is also interesting to note that Puget made the comment that a higher number of their 13th gen CPU failures were happening after delivery rather than being discovered during their own testing on the systems before shipment. That either means that their own testing is insufficient to expose the issue, or that the issue, as many are finding, is rapid degradation and so the problem only becomes detectable after a certain amount of time using the system. Or it could be a combination of the two.

That's why I think Wendell's testing is valuable. He found a few loads that seem to reliably expose the issue and was given access to a much larger sample size than anyone else had access to in order to test systems. Now, there are still a bunch of caveats with extrapolating his testing to the broader market (non-random sample leading to sample bias, low sample count, etc.), but it did show that there was a real issue at hand for at least a certain amount of RPL products that is not seen with other CPUs. That's the whole reason I started this thread, because even today, determining the cause of the issue and whether your CPU is degraded or not is not always a clear-cut process. I just didn't want people pushing off getting an RMA until after their warranty expired if they had crashes (or other issues) thinking it was something else, when it could very well be their CPU, especially since Intel was being as clear as mud about what was happening. Now that Intel has extended the warranty period and with all the awareness that is out there, it's not as much of a concern in my eyes, though I still think Intel should be more transparent about what is happening.
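To give a feel for the "low sample count" caveat, here is a minimal sketch of a Wilson score interval for an observed failure rate. The counts in the example are invented for illustration and are not Wendell's or Puget's numbers. It also only captures sampling noise; it does nothing about the non-random-sample bias mentioned above.

```python
import math


def wilson_interval(failures: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a failure proportion.

    failures: number of failed units observed
    n:        total units tested
    z:        normal quantile (1.96 is roughly a 95% interval)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= failures <= n:
        raise ValueError("failures must be between 0 and n")
    p = failures / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: the same 10% observed rate at two sample sizes.
for failures, n in [(10, 100), (1000, 10000)]:
    low, high = wilson_interval(failures, n)
    print(f"{failures}/{n}: {low:.1%} to {high:.1%}")
```

The same observed rate gives a much wider plausible range at a few hundred systems than at tens of thousands, which is why a test like this can show that a problem is real without pinning down the market-wide rate.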
I'm fine with alternative viewpoints; I was just objecting to your hand-waving away the data/evidence that has been presented by some of the YouTube personalities. I agree that some have tried to get rage clicks and all that, but not all YouTubers are the same, and Wendell specifically was very level-headed about the whole issue and provided probably the most complete data-driven evidence we have on the issue. If we just ignore the best evidence we have available, then it's just going to be people's anecdotal experience driving their opinions, which will not be a fruitful discussion.

Edit: Here is the release you are talking about. I saw it but didn't link to it because there was nothing new in it to discuss outside of Intel giving it an official name (which is really just a more technical sounding way of saying rapid degradation).
 
A report claims the 14900K is out of stock in Hong Kong, so the distributor is offering refunds. Some rightly question what position that leaves the user in. They have an Intel mobo with no CPU. Buy a lower tier CPU? If you had a 14900K you're likely after "the best" and lower won't really cut it. Fire sale the mobo and either switch to AMD or wait for Arrow Lake if they're still ok with Intel?

As a wider thought, how much verification are Intel doing on RMAs? Basically, my question is what % of returns actually have a fault with them? Are they testing them to get data, or is it going straight to a recycling point? How much testing would be needed to be confident if a CPU was both fault free and still had sufficient operating margin to reuse? That could open up additional supply for RMAs if the cost burden isn't excessive.
 
Another update from Intel, who have further narrowed down the part of the CPU that is getting hit by the voltages: it is a clock tree, and the damage leads to a duty cycle shift and the instabilities.

They will be releasing another microcode to address another contributing condition: elevated voltages while the CPU is idle or under light loads. This microcode will be 0x12B, and its performance is claimed to be within measurement tolerance of 0x125. I've not kept track of which version did what to performance, but it was interesting that it wasn't compared to the previous release, 0x129.
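For anyone wanting to check which microcode revision their system is actually running, here is a rough sketch for Linux, where `/proc/cpuinfo` on x86 lists a `microcode : 0x...` line per logical CPU. The function names and the sample text are made up for illustration, and treating a numerically higher revision as "newer" is an assumption that only makes sense when comparing revisions for the same CPU model. On Windows you would read the revision from the BIOS screen or a tool like HWiNFO instead.

```python
import re

# Revision named in the post above; only meaningful for the affected desktop parts.
TARGET_REVISION = 0x12B

_MICROCODE_LINE = re.compile(r"^microcode[ \t]*:[ \t]*(0x[0-9a-fA-F]+)[ \t]*$", re.MULTILINE)


def microcode_revisions(cpuinfo_text: str) -> set[int]:
    """Collect the distinct microcode revisions found in /proc/cpuinfo-style text."""
    return {int(match, 16) for match in _MICROCODE_LINE.findall(cpuinfo_text)}


def has_at_least(cpuinfo_text: str, target: int = TARGET_REVISION) -> bool:
    """True if every reported revision is numerically >= target.

    Returns False when no microcode line is present at all.
    """
    revisions = microcode_revisions(cpuinfo_text)
    return bool(revisions) and min(revisions) >= target


# Invented sample text in the /proc/cpuinfo layout, for demonstration only.
SAMPLE = "processor\t: 0\nmicrocode\t: 0x129\nprocessor\t: 1\nmicrocode\t: 0x129\n"
print(sorted(hex(r) for r in microcode_revisions(SAMPLE)))
print(has_at_least(SAMPLE))
```

On a real system you would pass in the contents of `/proc/cpuinfo` (for example `open("/proc/cpuinfo").read()`) instead of the sample string.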
 
I'm not an engineer. What I don't know, or am maybe unaware of, is why these issues started cropping up recently. In one of today's tech news articles, the writer mentioned that Intel was blaming mobo makers. Is that really what is going on, that mobo makers were exceeding "limits" of sorts?
 
I don't have an exact timeline, but very roughly the first rumbles that there might be some kind of problem with these CPUs started popping up in February. Around April it got loud enough that Intel started doing something about it, and the first step was to curtail some of the more excessive optimisations mobo manufacturers had set. It should be noted that in the past, and it might still be the case, system builders were given the freedom to adjust the power limit alone. This was not considered overclocking and was allowed. There are many other things that can be tweaked, and I don't know which ones would or wouldn't be considered out of spec by Intel. As time moved on Intel found an eTVB bug, which was not the cause, but they fixed it anyway. Later again they found the voltage problem, and the latest finding was which part got worn out first by that excessive voltage, along with yet more voltage optimisations.

So back to the question if mobos were to blame, kinda but not. Without the latest fixes, it seems like the CPUs would still degrade at a higher than expected rate over time, although maybe it would have been slowed down for some people.

If you've ever had to work in product support, it really isn't as simple as some think that you report a problem, someone looks at it and fixes it quickly. It could happen for something simple, but this was far from simple. It is about worst case since it varies slowly over time. Problems that are easy to repeat on demand are much faster to find.
 