
Troubleshooting memory errors


Kophax

New Member
Joined
Jun 23, 2011
Greetings all, I've recently run into some problems with my PC and I'm wondering if anyone here can provide some insight or troubleshooting tips.

I'm cross-posting this from my private forums, so the context may seem a little off once I get into the memtest questions.

SYMPTOMS:
- 2 BSOD in 1 week
- On boot up, randomly starting to get things like "Superfetch could not start"
- After both BSODs: hard reset, back to the desktop, then flaky OS behavior. Once a game client reported that it was corrupt, a few times a game client crashed just past the login screen, plus other oddities on the Vista desktop.
- Shutting down, waiting 10-15 seconds and booting up again seemed to make the game clients more stable, but lately the Superfetch problem has occurred more often.

At first, I thought it may have been HDD so I tested a few things.

TROUBLESHOOTING SO FAR:
- Windows Memory Test (nothing)

- Windows CHECKFS (nothing)

- Prime95 in blend mode (lots of memory tested) has errors. Prime95 in the FPU/CPU mode with little memory did not have any errors (I didn't run it for an extended period of time, it was late). This led me to believe memory was the culprit.

- Memtest+ errors, see WOT below.

As mentioned below, there was a (poorly done?) OC on this PC at one point, but I've since reset the BIOS settings to default, and all the memtest runs that showed errors were done with this factory setup.

----------------------------

It's my first time using Memtest+ but here's a quick breakdown if any of you have any ideas:

My "old" PC is basically the same PC as this one:

OLD PC
E8400, OC'd to 3.83 GHz, increased FSB/voltage, didn't play too much with memory.
4GB PC6400 DDR2
EVGA 780i SLI Mobo
video,etc....

I ran this setup for a good 6 months without any crashes or stability problems, though I never ran memtest when testing the OC.

NEW PC
E8400, OC'd to 3.6 GHz, FSB adjusted, CPU voltage left alone (the E8400 is known for reaching 3.6 GHz without a voltage adjustment, so I'm guessing he just changed the FSB and left it...)
8GB PC8000 DDR2 (I think it's 8000)
EVGA 780i SLI Mobo
video, etc

Basically the new PC had more memory (also threw in a new video card) but slower clock on CPU. The guy who originally set up the OC on this PC never really ran it through any stability tests, so the first thing I did was run Prime95 on it to see if there were any problems with his OC.

About 5 seconds into the test, Prime95 in blend mode (lots of RAM tested) starts giving me Illegal sum/possible hardware issues.

I go into the BIOS and set everything back to factory (memory auto, FSB auto, CPU @ 3.00 GHz), run Prime95 again, same problem.

I boot up Memtest+ and it goes something like this:

Test 1: 148k-2GB GOOD, 2GB - 4GB GOOD, 4GB - 6GB GOOD, 6GB - 8GB GOOD, 8GB - 9GB ERRORS
Test 2: 148k-2GB GOOD, 2GB - 4GB GOOD, 4GB - 6GB GOOD, 6GB - 8GB GOOD, 8GB - 9GB ERRORS
Test 3: 148k-2GB GOOD, 2GB - 4GB GOOD, 4GB - 6GB GOOD, 6GB - 8GB GOOD, 8GB - 9GB ERRORS
Test 4: 148k-2GB GOOD, 2GB - 4GB GOOD, 4GB - 6GB GOOD, 6GB - 8GB GOOD, 8GB - 9GB ERRORS
etc... etc...

So what I'm seeing is that it constantly fails in the last segment of memory, roughly the 8300-8500MB address range. It failed on this segment in every test, leading me to believe there's a problem with one of the DIMMs.

I was trying to figure out why it was testing above the 8GB mark with only 8GB installed. Only one post I found mentioned something that could explain it: PCI addressable space. If that's right, the 8-9GB range is still real memory on one of the DIMMs, just mapped to a higher address.
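For anyone following the address math, here's a rough Python sketch of how I understand that remapping. It's a simplified model, not documented 780i behavior: the 1GB hole size, the assumption that the hole sits directly below 4GB, and the function names are all my own assumptions for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def remapped_range(installed_bytes, hole_bytes):
    """Address range where RAM hidden under the PCI hole reappears.

    Simplified model: the chipset moves the RAM that would sit under the
    hole to just above the top of installed memory.
    """
    return installed_bytes, installed_bytes + hole_bytes

def dram_offset(address, installed_bytes, hole_bytes):
    """Map an address in the remapped range back to an offset in installed RAM.

    Assumes (hypothetically) that the hole occupies the space directly
    below 4GB, i.e. [4GB - hole, 4GB).
    """
    start, end = remapped_range(installed_bytes, hole_bytes)
    if not start <= address < end:
        raise ValueError("address is not in the remapped range")
    hole_start = 4 * GIB - hole_bytes
    return hole_start + (address - start)

installed = 8 * GIB
hole = 1 * GIB  # assumption: actual hole size depends on the board and cards

start, end = remapped_range(installed, hole)
print(f"remapped range: {start // GIB}GB - {end // GIB}GB")
```

Under those assumptions the 8-9GB segment lines up with the memtest output above. Which physical stick that range lands on still depends on how the board interleaves the channels, so this alone doesn't identify the bad DIMM.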

Out of curiosity, I ran the test on my "old" PC and everything was fine except for test #5, which showed about 80 errors in the 560MB range. I have seen many people failing on test #5. While this means there could be stability or hardware issues with the RAM, I think the fact that it only failed on test #5 (I didn't run them all) means it's much more stable (though still not good) than the current setup. This could explain why I never had stability issues on that machine.





So, all that to say: I think one of the DIMMs may be faulty. It was late, so I didn't get to try anything else, but my first tests today were going to be:

1) Swap DIMM slots (slot 3 to 1 and 1 to 3) on the mobo to see if the errors consistently show up in a new address range. This would lead me to believe it's one of the DIMMs.

2) If #1 points to a DIMM, try testing each DIMM one at a time. I'm not sure if you can do this with DDR2, or if it will throw POST errors for not having the second module of the pair?

3) Change slot pairings on the mobo (slots 2 and 4). I'm pretty sure I can switch to 2 and 4 even though 1 and 3 are empty?

4) Take the old 2x2GB modules out of my old PC and try them on this mobo. If I get errors at the same 560MB range, I can probably assume there are no problems with the mobo slots.
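To keep track of the results from those steps, here's a minimal Python sketch of the elimination logic. The DIMM labels, slot numbers and the `suspects` function are made up for illustration, and it assumes each DIMM gets tested on its own in at least two slots.

```python
from collections import defaultdict

def suspects(results):
    """Return (bad_dimms, bad_slots) from single-DIMM memtest runs.

    results maps (dimm_label, slot_number) -> True if memtest reported errors.
    A DIMM is suspect if it failed in every slot it was tried in; a slot is
    suspect if every DIMM tried in it failed. Anything with fewer than two
    runs is skipped because one run can't separate DIMM from slot.
    """
    by_dimm = defaultdict(list)
    by_slot = defaultdict(list)
    for (dimm, slot), failed in results.items():
        by_dimm[dimm].append(failed)
        by_slot[slot].append(failed)
    bad_dimms = sorted(d for d, f in by_dimm.items() if len(f) >= 2 and all(f))
    bad_slots = sorted(s for s, f in by_slot.items() if len(f) >= 2 and all(f))
    return bad_dimms, bad_slots

# Hypothetical results: DIMM "B" fails in both slots, DIMM "A" passes in both.
example = {
    ("A", 1): False, ("A", 3): False,
    ("B", 1): True,  ("B", 3): True,
}
print(suspects(example))
```

If the errors follow one stick from slot to slot, that stick is the likely culprit. If every stick fails in the same slot, the slot or the board looks more suspicious. If everything fails everywhere, settings (voltage/timings) are worth checking before blaming hardware.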

I'm new to memory errors like this, but I'm curious whether setting the memory voltages/timings to factory specifications (instead of leaving them on auto) might make the memory stable?


I know it's a big wall of text. Is anyone familiar with this kind of troubleshooting?
 
Post the exact model of RAM you have, including timings and voltage at the module's rated frequency. Also post pics of CPU-Z open to the CPU, Memory, and SPD tabs. And post the current DRAM voltage, either taken from the Hardware Monitor screen in the BIOS, or using software from within the GUI.
 
Thanks for such a detailed post! :welcome:

Out of curiosity, I ran the test on my "old" PC and everything was fine except for test #5, which showed about 80 errors in the 560MB range. I have seen many people failing on test #5. While this means there could be stability or hardware issues with the RAM, I think the fact that it only failed on test #5 (I didn't run them all) means it's much more stable (though still not good) than the current setup. This could explain why I never had stability issues on that machine.
#5 and #6 tend to be where the most errors come up. If you get failures on tests #1-#4, usually you won't even be able to get into the OS. The earlier tests are sort of sanity checks for basic stability. Still, errors in test #5 mean instability of one sort or another.

I'm new to memory errors like this, but I'm curious whether setting the memory voltages/timings to factory specifications (instead of leaving them on auto) might make the memory stable?
It's quite possible. Factory defaults on your motherboard might be outside of your memory's actual recommended settings, so you usually should adjust them manually to make sure you're in good territory. It would be helpful to know the make, model number, and recommended speed/timings/voltage of the memory kit you have, like redduc said.

Once you find those, go make sure that they're all set properly in the BIOS, then run memtest again. You should be able to configure memtest to run just test #5 if you want to speed up detection a little bit. Once you get #5 stable for a few passes, go back to running the full test.
 