
X299 and 7980XE in 2024

I'd be stuck double-checking voltage, memory and C-state config.
Had the same BSOD again. The funny thing is, I was shutting down Windows when it happened this time. I was restarting the PC as I wanted to try increasing ram voltage. The first time the BSOD happened XMP was on. I turned it off, which is the state it was in when the 2nd one happened. Seeing the above comment I looked at reported ram voltages. 1.184 with XMP off. Should be 1.20 for JEDEC, right? I've now put 1.22v in bios, which gives a measured 1.20.

1705617944839.png
Anything else catching people's eyes with voltages here? I'm off to bed now. See if the system is still up in the morning. It will again be running Prime95-like workload, and I've got a large upload to do in parallel.
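As a rough sanity check on the 1.184 V reading mentioned above, here is a minimal sketch that compares a reported DIMM voltage against the 1.20 V DDR4 nominal. The ±0.06 V window and the helper name `vdd_report` are assumptions for illustration, not values taken from this board's documentation or from hwinfo64.

```python
JEDEC_DDR4_VDD = 1.20   # nominal DDR4 voltage in volts
ASSUMED_TOLERANCE = 0.06  # assumed acceptable window in volts (illustrative)


def vdd_report(measured, nominal=JEDEC_DDR4_VDD, tolerance=ASSUMED_TOLERANCE):
    """Return the deviation from nominal in millivolts and whether it is inside the window."""
    delta = measured - nominal
    return {
        "delta_mv": round(delta * 1000),
        "within_window": abs(delta) <= tolerance,
    }


# Readings quoted in the post: XMP off at defaults, then after setting 1.22 V in BIOS.
for reading in (1.184, 1.20):
    print(reading, vdd_report(reading))
```

Under those assumptions the default reading is only 16 mV low, which would be consistent with ram voltage not being the obvious culprit.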
 
You can increase the VCCSA and VCCIO to 1.2 as this can help stabilize the memory. If you are running AI Suite (ASUS) you should be able to see in real time what these are running at under Windows. I see in your sig that you are running 8 sticks and the SA/IO will help with this. What are you running the NB at? <- I have found that this can be the difference of up to ~8C (7980XE) when run at 2.6GHz (basic) instead of 3.2GHz/3.3GHz. On mine 2.6 is ~1.25v and 3.2/3.3 is ~1.475v/1.5v
 
@mackerel
Since you have superior 4-channel bandwidth do you tune your memory for the lowest latency as opposed to the highest bandwidth?

@MaddMutt
Is 1.475 or 1.5V considered risky?

RIP Bill Paxton.
 
I did have another crash since going to bed. Found the PC at login screen a moment ago. I'm curious why it is happening now and not earlier when I first installed the CPU. The only other change is I updated to the latest nvidia driver, so I'm trying switching to slightly older Studio driver and see if that makes a difference. Edit: if that doesn't help, then I'll try switching back to my old CPU. That could then show if the problem is related to the CPU or a wider problem with my system. Basically system seemed to be working fine since I installed CPU on Tuesday, problems started Thursday evening after the driver update. I've now gone from GRD 546.65 to SD 546.33. I was on whatever the previous GRD was before updating.

You can increase the VCCSA and VCCIO to 1.2 as this can help stabilize the memory. If you are running AI Suite (ASUS) you should be able to see in real time what these are running at under Windows. I see in your sig that you are running 8 sticks and the SA/IO will help with this. What are you running the NB at? <- I have found that this can be the difference of up to ~8C (7980XE) when run at 2.6GHz (basic) instead of 3.2GHz/3.3GHz. On mine 2.6 is ~1.25v and 3.2/3.3 is ~1.475v/1.5v
You mean mesh? It's running mobo defaults whatever that is. BTW I forgot to say, I don't suspect ram although I haven't tested it hard either. It is just something to try.
I'm using hwinfo64 to monitor.

@mackerel
Since you have superior 4-channel bandwidth do you tune your memory for the lowest latency as opposed to the highest bandwidth?
I don't optimise beyond turning on XMP. To me bandwidth is king. Latency makes very little to insignificant difference.

Your voltages look ok, battery is getting low.. might be fussy about that.. stranger things have happened :)
Still well over 3.0V so personally don't think that's any concern.
 
Another BSOD. Note the previous one was "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED" and the latest one is "IRQL_NOT_LESS_OR_EQUAL", so this is just randomly unstable.

We can rule out the GPU driver then. But that leaves me with no possible explanation other than the CPU might be unstable in this system (combination with mobo, etc.). I'm tempted to put the old CPU back in to find out if that is the case or not. Maybe I'll try a bios reset in case any settings from the old CPU are not optimal for the new CPU.

The other X299 mobo I ordered should arrive tomorrow.
 
What are the core temps? These can swing by 15C+ from hottest to coldest when not delidded. Yes I meant the mesh (CPU-Z lists it as NB). What is your AVX offset? In benching (short runs) I had to set the offset to AVX -8 and AVX-512 -10, which was ~3.8GHz/3.6GHz with an EVGA 360 AIO.
*This is a very long shot as you are an experienced player :thup: - Take out the CPU and clean the contacts on it with alcohol. Spray the socket with electrical contact cleaner. Use a flashlight and a magnifier to verify the pins in the CPU socket. Did you remove the memory when you switched out CPUs? If so, clean the contacts and the mem slots.

** Another BSOD. Note the previous one was "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED" and the latest one is "IRQL_NOT_LESS_OR_EQUAL", so this is just randomly unstable.
^ IIRC this is a memory issue. This can run from not enough voltage on main memory/IMC to the mesh being run too fast/not enough voltage for the speed set.
- Could also be a power issue as the 7980xe is power hungry. <- You can check this by cutting down the cores to 12c/24t and testing to see if you have a crash. This also reduces the heat produced so you can see if it might be heat related.
 
Temps: highest I got any core was 73C. It was more than 10C hotter than coolest, but still fine.

I'm running bios defaults on clocks. Right now I'm putting some work through it similar to Prime95 small FFT, hottest core 56C. Actually that sounds too low! Clocks 2.6 to 2.7 GHz, keeping in mind this is an AVX-512 load. Edit: it looks like it is running on the PL1 power limit = TDP.

A deeper clean of the CPU was on my mind. I only rediscovered my IPA after I installed it. The contacts did look less shiny than my own CPU. I'm pretty confident my socket is ok.

Power wise the 7980XE isn't much more than the 7920X. PL2 198W.
 
IMG_20240119_153151.jpg
I'm blind now. So many LEDs. I hope there is some software somewhere to control them. BIOS is from 2018 so probably never updated? Got POST, now to get an OS on it.
 
Do any of those LEDs have a diagnostic function? I hate the decorative LEDs they put in late model motherboards and videocards, they're just a waste of power because I really can't ever see them.
 
The tiny green one to lower right might be power, or something. The ones by the PCIe slots and ram are decorative as far as I can tell. RGB though. I'm about to look for any software that can control it, like turn it off.

Oh, I did find a bios update for it. Takes it up to 2020.
 
Slowly getting there.
I swapped the A380 with a 1650 so I don't have to bother with the power connector.
BIOS is very basic. No XMP, no CPU control at all.
Windows software can control the LEDs including turning them off.
I believe the system originally came with Win10 Home, so hoped it would activate. For some reason I couldn't install Win10 (just did nothing) so I installed 11 home instead. Not activating. I might try 10 again now the bios is updated.
 
I have to agree, it is a pretty sweet setup. I'd hit it :attn:

Bios update should help you out, I think it's a memory thing.. but I haven't run that series of Intel, so I can't really speak on anything.
 
New system: The LED status is not saved between reboots so I have to run their Windows software each time to turn it off. New BIOS made no change, still can't install Win10, gave up on that for now. I'd like to put it in a case but the front panel header uses a more compact pinout than enthusiast boards so I don't have an easy way to connect up the power switch. Currently I use the screwdriver method to manually short the pins.

Old system: After the bunch of crashes yesterday it then behaved for a long time from afternoon through to this morning. Then crash. I just tried swapping the ram as suggested elsewhere although I'm unconvinced if it will help. If it doesn't, then the old CPU will go back in.
 
IMG_20240122_145913.jpg
New system update: I got a cheap case for it. I was wondering what to do about the front panel connector, as it uses 2.0mm pitch pins as opposed to the more common 2.54mm pitch on enthusiast cases. Turns out if I only connect the power switch, it does fit.

Old system update: Since switching the ram out as suggested on another forum, I've not had a single crash. It's been over 2 days uptime now. It is still unclear why I had those crashes. Loose contact on the ram? Ram actually going unstable? New CPU didn't like it but it was fine the first two days? At some point I'll test the ram I took out, but it won't be any time soon. I'm just happy to have a stable system again.
 
Can you tell us the memory that you are now using? This also looks like only 4 sticks instead of the 8?
 
On the new system pictured, it is running 4x 4GB Ripjaws 4 3333 but as no XMP it is running 2133.

On the old (previously unstable) system, it is currently running 4x 16GB 2R modules. I think they're Corsair Vengeance Pro RGB. They're rated at 3200 but I haven't turned on XMP yet. One step at a time. Since these are 2R modules, they should perform near enough the same as the old set of twice as many 8GB 1R modules.
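To put numbers on that comparison, a small sketch that counts capacity and ranks per channel for the two kits described above. The `Kit` class and the even spread of modules across four channels are assumptions for illustration. Matching rank counts suggests similar behaviour but does not guarantee identical performance.

```python
from dataclasses import dataclass

CHANNELS = 4  # quad-channel platform, modules assumed evenly spread


@dataclass(frozen=True)
class Kit:
    modules: int
    gb_per_module: int
    ranks_per_module: int

    def capacity_gb(self):
        return self.modules * self.gb_per_module

    def ranks_per_channel(self):
        return self.modules * self.ranks_per_module / CHANNELS


old_kit = Kit(modules=8, gb_per_module=8, ranks_per_module=1)   # 8x 8GB 1R
new_kit = Kit(modules=4, gb_per_module=16, ranks_per_module=2)  # 4x 16GB 2R

for name, kit in (("old", old_kit), ("new", new_kit)):
    print(name, kit.capacity_gb(), "GB,", kit.ranks_per_channel(), "ranks per channel")
```

Both kits come out at 64 GB with two ranks per channel, which is the basis for expecting them to behave alike.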
 
tm2.png

tm1.png

Ok, I'm NOT done with this yet. Yes, it has taken me 4 days to realise that I'm missing a channel - I haven't rebooted the system since I swapped the ram. I'm in the middle of a BOINC challenge elsewhere so I don't want to touch the hardware for another 2 days when it is over. But I guess that kinda shows the problem I had.

I think this might be beyond just reseating ram. Gonna get the alcohol out and give the CPU contacts a deeper clean. I did think the pads on the CPU placed in the system days before the crashes were less shiny than the one it replaced.

Edit: I can see the LEDs on all 4 modules are lit up. CPU-Z can read the SPD on all 4 modules.
 
Finally took apart the system to give the CPU a deep clean. That... was a bigger job than expected. Long story short, previous owner must have used liquid metal at some point. When I wiped some of the excess paste off the CPU (still in mobo socket) some shiny stuff came out. It almost got into the socket but I managed to clean it up first.

With the CPU out I gave it a deep clean. With hindsight, I should have done this before I used it in the first place. More liquid metal was found behind the paste. The pads were a little dirty: when I cleaned them with alcohol, the wipe came out dark. With all that cleaned I put it back in and... still only 3 channels of ram showing. Time to start moving ram around I guess but I don't intend to spend too long on this. Got other things I'd rather do and I can limp on with 3 channels until later.
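For a sense of what limping on with 3 channels costs, a back-of-envelope sketch of theoretical peak memory bandwidth. It assumes a 64-bit (8-byte) data path per DDR4 channel and uses 2133 MT/s as the XMP-off speed, which is an assumption for this system. The function name is made up for illustration, and real-world throughput will be lower than these peak figures.

```python
BYTES_PER_TRANSFER = 8  # one 64-bit DDR4 channel moves 8 bytes per transfer


def peak_bandwidth_gbs(channels, mega_transfers_per_s):
    """Theoretical peak bandwidth in GB/s (decimal) for a given channel count and speed."""
    return channels * BYTES_PER_TRANSFER * mega_transfers_per_s / 1000


speed = 2133  # assumed MT/s with XMP off
four = peak_bandwidth_gbs(4, speed)
three = peak_bandwidth_gbs(3, speed)
print(f"4 channels: {four:.1f} GB/s")
print(f"3 channels: {three:.1f} GB/s ({three / four:.0%} of quad-channel)")
```

Losing one of four channels removes a quarter of the peak bandwidth regardless of memory speed, which matters most for bandwidth-bound work.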
 