
Memtest errors, then no errors for 24 hours after changing a BIOS setting


phatty2x4

Member
Joined
Aug 26, 2004
Location
Oregon
Threadripper 1950X
64GB G.Skill 3200MHz CAS 14
ASRock X399 Taichi

It had been running fine for 8 months, then I started getting BSODs, with a different error every time. I run memtest and get errors.

01.jpg

The first thing I do is update all my drivers and my BIOS (I was running an older BIOS). That didn't work; I still got BSODs.
So I think, 'I'll just bump the memory voltage from 1.35V to 1.36V and see what happens.' I change the voltage and then run memtest for 25 hours without any errors.

02.jpg

I think it's fine, then a few days later I start getting BSODs again. I run memtest and now I'm getting errors with the memory voltage bumped up, even though it just passed for 25 hours.

03.jpg

So I'm starting to think something is weird... I change the voltage back down to 1.35V and... no errors.

Something I've noticed is that the 12V rail has been dipping down to 11.7V. I know this is still in spec, but could it cause any issues? Could this still be the memory, or could it be something else (the CPU)? I've NEVER seen RAM do this before. It either errors out, or it doesn't. I've never seen it pass for 25 hours and then fail a couple of days later, or start passing once I lower the voltage.

I'm going to start testing individual sticks, but I'm afraid I'll get the same behavior and a pass in memtest won't mean anything because of the way it's been acting.
 
Safe voltage is +/- 10%. It shouldn't affect memory stability but who knows.
I would check single sticks as nothing else seems unusual.
Is that at XMP settings? Have you tried a bit higher SOC voltage?
 
Safe voltage is +/- 10%. It shouldn't affect memory stability but who knows.
I would check single sticks as nothing else seems unusual.
Is that at XMP settings? Have you tried a bit higher SOC voltage?

I tested the PSU last night, and while the software says the voltage drops to 11.669V, my multimeter only measured between 12.06V and 12.12V the entire time. I also ran Prime95 and Furmark and couldn't get it to crash.

It's running XMP settings. I double-checked and they're the right settings and timings for the RAM. I haven't tried higher SOC voltage. I started testing a single stick last night, and I'm on 10+ hours now without any errors. It seems like I have to boot into Windows, play games or work on my game in UE4 for a while, reboot and THEN memtest will give errors. This happened last night when I tried to run memtest. I worked on my game for about an hour, rebooted into memtest, selected test 7 and had an error within 30 seconds. I turn the computer and PSU off, turn it back on, and then I can't get it to error. It's the most frustrating thing to troubleshoot. Maybe just run it with 16GB for now and see if it'll crash again, I guess?
 
Safe voltage is +/- 10%. It shouldn't affect memory stability but who knows.
I would check single sticks as nothing else seems unusual.
Is that at XMP settings? Have you tried a bit higher SOC voltage?
5% is ATX spec... but it's still in spec. :)
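
As a quick sanity check on the numbers in this thread, here is a minimal sketch that compares the readings posted above against a ±5% band on the 12V rail. The helper names (`rail_limits`, `in_spec`) are made up for illustration, and the readings are simply the values quoted in the posts.

```python
# Rough check of 12V rail readings against a tolerance band.
# Helper names are illustrative; readings are the values quoted in the thread.

NOMINAL_12V = 12.0
ATX_TOLERANCE = 0.05  # +/- 5% on the +12V rail


def rail_limits(nominal, tolerance):
    """Return the (low, high) voltage limits for a rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def in_spec(measured, nominal=NOMINAL_12V, tolerance=ATX_TOLERANCE):
    """True if the measured voltage falls inside the tolerance band."""
    low, high = rail_limits(nominal, tolerance)
    return low <= measured <= high


readings = {
    "software, lowest reading": 11.669,
    "multimeter, lowest reading": 12.06,
    "multimeter, highest reading": 12.12,
}

if __name__ == "__main__":
    low, high = rail_limits(NOMINAL_12V, ATX_TOLERANCE)
    print(f"Allowed range: {low:.2f}V to {high:.2f}V")
    for label, volts in readings.items():
        status = "in spec" if in_spec(volts) else "OUT OF SPEC"
        print(f"{label}: {volts}V -> {status}")
```

With a ±5% band the window works out to roughly 11.4V to 12.6V, so both the software reading and the multimeter readings sit inside it.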
 
Ran one stick for 34 hours without errors. Since I ran the full set for 25 hours without errors and then started getting errors again, I'm not really trusting it 100%, so I'm running P95. I got an error in P95 when all four sticks were installed. But so far with this one stick, no errors. If I still have no P95 errors when I get home, I'm going to try another stick.

memtest.jpg
 
Passed 24 hours of P95. It won't let me upload the triple-screen screenshot for whatever reason, but oh well. On to the next DIMM.
 
One thing I have found that helps Ryzen systems with RAM consistency is disabling "fast boot" in the BIOS.
 
Do you have your system in a case? I know this sounds odd, since cooling RAM is usually deemed unnecessary, but in a Ryzen thread I frequent they have found they get random memory errors when RAM temps hit the mid-40°C range and up. I doubted it for a long time but have seen some of the testing. Maybe a fan on your sticks would help?
 
With the inconsistency of the problem, Johan45's cooling suggestion makes sense to me. Though you would think P95 stress testing, at least the "blend" option, would produce errors if overheating was a problem with the RAM. I would give the RAM a tad more voltage and see if that helps.
 
Do you have your system in a case? I know this sounds odd, since cooling RAM is usually deemed unnecessary, but in a Ryzen thread I frequent they have found they get random memory errors when RAM temps hit the mid-40°C range and up. I doubted it for a long time but have seen some of the testing. Maybe a fan on your sticks would help?

This is interesting. The stick closest to the VRMs does get hotter than the others, but the system has been stable for 8 months. Not sure why this would be a problem now when it wasn't before. Regardless, I flipped the two 120mm fans on the top of my case so they blow downward onto the VRMs and the RAM. We'll see if it makes a difference.

With the inconsistency of the problem, Johan45's cooling suggestion makes sense to me. Though you would think P95 stress testing, at least the "blend" option, would produce errors if overheating was a problem with the RAM. I would give the RAM a tad more voltage and see if that helps.

I did try bumping the RAM voltage by 0.01V and I still get the same errors.

I tested individual sticks, but I'm not getting any errors. The problem is that it's hard to reproduce whatever makes the errors occur. If I start testing from a cold boot, turn the computer off and then on again, or change a BIOS setting and boot, I don't get errors for 24+ hours. I only get errors if I've been using the machine in Windows for a while and then restart. I'm honestly leaning towards the memory controller at this point, but I have no idea how to test that. Maybe bump the SOC voltage like mentioned before? I did swap the stick by the VRMs with the first one to see if anything happens. That, plus having the fans blowing on it. I guess we'll see.
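
Since the pattern seems to depend on how the machine was started before the test, one way to keep it straight is a small tally of memtest runs by boot condition. This is only a sketch: the `runs` list is paraphrased from the posts above, covers only the runs where the starting condition was actually stated, and the labels ("fresh start", "warm reboot") and the `tally` helper are made up for illustration.

```python
# Tally memtest outcomes by how the machine was started before the test.
# The runs are paraphrased from this thread; labels and helper names are
# illustrative assumptions, not anything memtest itself reports.
from collections import defaultdict

runs = [
    ("fresh start", False),  # voltage changed to 1.36V in BIOS: 25 hours clean
    ("fresh start", False),  # voltage changed back to 1.35V: no errors
    ("warm reboot", True),   # ~1 hour in UE4, reboot, test 7: error in 30 seconds
    ("fresh start", False),  # PC and PSU powered off and on: no errors
]


def tally(runs):
    """Return {condition: (total_runs, runs_with_errors)}."""
    counts = defaultdict(lambda: [0, 0])
    for condition, had_errors in runs:
        counts[condition][0] += 1
        counts[condition][1] += int(had_errors)
    return {condition: tuple(pair) for condition, pair in counts.items()}


if __name__ == "__main__":
    for condition, (total, failed) in tally(runs).items():
        print(f"{condition}: {failed} of {total} runs had errors")
```

Four runs is far too few to prove anything, but logging each new test this way would show whether the "only after a warm reboot" pattern holds up.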
 
Yes, try bumping the memory controller voltage. Just bumping the Vcore will probably include that anyway?
 