
Guidelines for Thorough Stability Testing

felinusz said:
Thanks :).

I'm still waiting to see if it is possible to break the article up into multiple posts and add an index, for ease of reading and reference (because it *is* terribly long).

Thanks for the info on the new version of memtest86 (about time the original version of memtest86 was updated...) - I will edit the original post where necessary.

Toast does indeed detect errors; it'd be pretty useless otherwise ;)

It is possible, but not without killing everyone else's posts.

Also, I think HL2 is a new stability tester ;)
 
Nice guide, but I have a few questions about memtest. Some people say that 25 passes of test 5 is enough to say it's stable. How about that? I even get about 14 errors per pass with the default tests at 100x10, CPU at 1.7v, RAM at 3.2v, single 512MB stick.... So it has to be error-free for 24 hours with the default tests to be stable?
 
Fr3ak

Nice guide, but I have a few questions about memtest. Some people say that 25 passes of test 5 is enough to say it's stable. How about that? I even get about 14 errors per pass with the default tests at 100x10, CPU at 1.7v, RAM at 3.2v, single 512MB stick.... So it has to be error-free for 24 hours with the default tests to be stable?


A good question :).

About memtest86 test #5 specifically:


felinusz

Many people also use specific memtest86/memtest86+ tests by themselves, to test out a new FSB or memory overclock quickly in order to see if it’s likely to be stable or not. Tests 5 and 6 in particular are good for this. However, 24 hours of all the tests on loop is your end-all solution to memory stability testing.

I should add that I personally test my memory with test #5 looped for 24 hours, in addition to using all of the default tests looped for 24 hours. Test #5 is a great tool.

About the 24 hour time period specifically:


felinusz

As with Prime95, 24 hours really is required for a complete and thorough memtest86/memtest86+ stability test, and for the exact same reasons. I’ll quote myself for reference's sake.

From Above

When you are stability testing with Prime95, you want to run the Torture Test for at least 24 hours. Why 24 hours?

There is a very common misconception that if your machine can pass Prime95 stability testing for, say, four hours, your machine will be able to run stable, regardless of what you are doing, for four hours as well, without issue. This is simply not the case.

Prime95 often finds errors in its 16th - 20th hour of testing, a potential for instability that wasn’t found after only four hours of testing. After only four hours of Prime95, the potential for instability still exists. 24 hours of Prime95 is a slight ‘overkill’, but you can never be too careful. 24 hours is widely viewed as a sufficient time period to catch any instability that may be present, but by all means test longer if you are able.


Now, for fear of typing lots and lots (which I know I am about to do!), let's take a look at stability in general, as a concept, in order to answer this question properly.

When it comes to stability, something is either stable or unstable. There is really no grey area "between" stability and instability: your overclock/hardware is either stable, or it is unstable. Yet it is common for us as overclockers to say things like:

"My overclock is 99% stable, it's almost there, I just need more voltage/testing"
"It is fairly stable at this speed"
"It is bench stable"

These are technically incorrect things for us to say. All of the above are actually unstable!

That said, the very nature of stability means that there is always a potential for instability lurking around the corner. Just because you stress test for 1000000 hours does not mean that the hardware cannot potentially crap out on you in the 1000001st hour of stress testing. There is no degree of hardware stability besides 100% and 0%, yet unfortunately there is also no true proof of 100% hardware stability.

This is why we need a standard that has proven itself to be very thorough.

When we stress test our hardware, we want to be as thorough as we can. The "24 hour rule" is simply a standard, based on the experiences of myself, and many other overclockers. It is possible to stress test something for 24 hours, and still have a potential for instability. But is it likely? No, not very likely at all.
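
To put rough numbers on why a short run proves less than a long one, here is a minimal sketch. It assumes, purely for illustration, that a marginal overclock throws errors at random with a constant average rate (real hardware faults need not behave this way), and the 18-hour figure is a made-up example rather than a measurement.

```python
import math


def chance_of_catching_error(test_hours: float, mean_hours_to_error: float) -> float:
    """Probability that at least one error shows up within test_hours.

    Assumes errors arrive at a constant average rate (an illustrative
    model only, not a property of any particular stress tester).
    """
    return 1.0 - math.exp(-test_hours / mean_hours_to_error)


# Hypothetical marginal overclock that errors about once every 18 hours.
for hours in (4, 24, 72):
    p = chance_of_catching_error(hours, mean_hours_to_error=18.0)
    print(f"{hours:>3} h test: {p:.0%} chance of catching the error")
```

Under this model the chance of catching a fault keeps rising with test length but never reaches 100%, which lines up with the point above: a longer test is more thorough, yet no finite test is a true proof of stability.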

So, having gone about this in a very roundabout way, I can say that 25 passes of memtest86 test #5 is not very thorough, and is largely insufficient as a means of weeding out instability in your memory overclock.

If you want to test for longer than 24 hours non-stop, by all means do so - be as thorough as you can be. However, stress testing longer than 24 hours could be deemed redundant, and would not leave us much time for overclocking ;)!

My apologies for the rambling post :).
 
Hi, I'm a stability maniac. I test all my rigs very deeply, multiple times.
My experience tells me:
1- Repeat your favourite set of programs 3 times; often some errors pop up in the 2nd or even the 3rd round.
2- Find a test that pushes all the components at full throttle at the same time: your CPU, memory, GPU, HDs. Under maximum stress, maybe your PSU can't keep up with all of them.
3- Run 3DMark01 together with Prime95. I've sometimes found that Prime95 and 3DMark are OK alone, but together they BSOD (a little bump in VCore generally fixes it).
4- Memtest is only useful for finding the absolute top speed of the sticks, but you need to back off some MHz (3-5 at least) to be really stable in Windows. Prime95 Blend mode is more effective even if it is incomplete (it doesn't test all the RAM).
5- Last but not least, if you test your rig very deeply and you've found the very limit of stability, where 1 MHz more of FSB is unstable territory, back off 3 MHz. Otherwise you have no margin at all, and a slight fluctuation can crash your system (with 3 MHz less you lose nothing in performance, but you have some headroom to compensate for "noise").
 
I have one question.
So basically you first find the top speed the computer will run "stably" at,
then you run the three tests to make sure the computer is running right?
OK, then let's say errors show up.
Then what should I lower/downclock/etc.?
Sorry, I'm still a noob at overclocking.
Please reply.
 
mazzy

Hi, I'm a stability maniac. I test all my rigs very deeply, multiple times.
My experience tells me:
1- Repeat your favourite set of programs 3 times; often some errors pop up in the 2nd or even the 3rd round.
2- Find a test that pushes all the components at full throttle at the same time: your CPU, memory, GPU, HDs. Under maximum stress, maybe your PSU can't keep up with all of them.
3- Run 3DMark01 together with Prime95. I've sometimes found that Prime95 and 3DMark are OK alone, but together they BSOD (a little bump in VCore generally fixes it).


Have you tried adjusting the program priority when running Prime95, or 3DMark? With the priority on default, the tests are not as stressful, and not as thorough. With Priority adjusted and increased, Prime95 and 3DMark2001 should be fine to use individually.


4- Memtest is only useful for finding the absolute top speed of the sticks, but you need to back off some MHz (3-5 at least) to be really stable in Windows. Prime95 Blend mode is more effective even if it is incomplete (it doesn't test all the RAM).
5- Last but not least, if you test your rig very deeply and you've found the very limit of stability, where 1 MHz more of FSB is unstable territory, back off 3 MHz. Otherwise you have no margin at all, and a slight fluctuation can crash your system (with 3 MHz less you lose nothing in performance, but you have some headroom to compensate for "noise").


Great point. Variables that can affect your overclock (voltage droop/fluctuation, ambient temperature shifts) change over time, so one does want a small margin for 24/7 use, even after testing :).
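
For a sense of how little that margin costs, here is a small worked example. The FSB and multiplier values are hypothetical, and it assumes performance scales at most linearly with CPU clock, which is a simplification.

```python
def clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    """CPU clock is the front side bus speed times the multiplier."""
    return fsb_mhz * multiplier


def backoff_cost_percent(fsb_mhz: float, multiplier: float, backoff_mhz: float = 3.0) -> float:
    """Percentage of CPU clock given up by lowering the FSB by backoff_mhz."""
    before = clock_mhz(fsb_mhz, multiplier)
    after = clock_mhz(fsb_mhz - backoff_mhz, multiplier)
    return 100.0 * (before - after) / before


# Hypothetical overclock: 220 MHz FSB with an 11x multiplier.
cost = backoff_cost_percent(fsb_mhz=220.0, multiplier=11.0)
print(f"Backing off 3 MHz costs about {cost:.1f}% of CPU clock")
```

Since the multiplier cancels out, the cost is simply the back-off divided by the FSB, so a 3 MHz margin on a 200+ MHz bus is on the order of one or two percent.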


ccbl91

I have one question.
So basically you first find the top speed the computer will run "stably" at,
then you run the three tests to make sure the computer is running right?
OK, then let's say errors show up.
Then what should I lower/downclock/etc.?
Sorry, I'm still a noob at overclocking.
Please reply.

Yes, you have it exactly :). If you found instability (test errors) in your overclock after running some thorough stability tests, lowering the overclock a little bit is one way to regain stability. Raising the overvolt and improving your cooling will also help, but these are not always possible.

Just view thorough stability testing as properly 'tuning' your overclock for the highest speeds, while maintaining stability for day-to-day use. :)
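
The "lower it a little and test again" loop can be sketched as below. This is only an illustration of the procedure: `is_stable` is a hypothetical stand-in for you running the full set of stability tests by hand at a given speed, and the example speeds are made up.

```python
from typing import Callable, Optional


def highest_stable_fsb(
    start_mhz: int,
    floor_mhz: int,
    is_stable: Callable[[int], bool],
    step_mhz: int = 1,
) -> Optional[int]:
    """Step the FSB down from start_mhz until the stability check passes.

    is_stable is a placeholder for a full manual test run at that speed.
    Returns the first speed that passes, or None if nothing down to
    floor_mhz passes (time to look at voltage, cooling, or the parts).
    """
    fsb = start_mhz
    while fsb >= floor_mhz:
        if is_stable(fsb):
            return fsb
        fsb -= step_mhz
    return None


# Hypothetical rig that only passes testing at 215 MHz or below.
print(highest_stable_fsb(220, 200, is_stable=lambda mhz: mhz <= 215))
```

In practice each call to the check is a day of testing, so larger steps down followed by smaller steps back up may save time; the sketch keeps a fixed step for clarity.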
 
Thanks felinusz for your reply. :)
I know that running Memtest and Prime for 24 hours is an indicator that the rig is stable. My point was more: can it be "stable" although Memtest shows some errors? Stable means in this case that the PC can run 3 weeks nonstop without rebooting, passes 24 hours of Prime95 and runs all games without crashing.
When I overclocked my first PC a few years ago, I tested it with Prime95, but not memtest. It has been running fine since then and it never crashed; it's the PC I trust the most, if you know what I mean. I tested it with memtest a few weeks ago, when I read more about memtest and had to do some testing with my barebone. I was quite surprised when Memtest had a few errors after a while, so I tested another PC with everything on stock and also got errors with it. I can't even get rig 2) in my sig memtest stable, not at stock speed and not even at far below stock speed. That was the main reason I was posting here.
I managed to get it stable in memtest test 5, Prime95 at priority 10 and 3DMark for 5 hours, but not in the memtest default tests. It might be my memory controller in combination with the RAM I have causing these errors, but I am not too worried about it, because I can run all games without any problems, so I would call it "stable".
Hmm, I kind of wrote a lot now, but I haven't said that much at all. I guess what I wanted to say is something like: it doesn't always have to be 100% error-free to be "stable". Maybe I just haven't found the right programs to make it crash yet.
 
The same thing happened to me with my old OCZ PC3500 EL CH-5 modules - they caused hundreds of errors in memtest86 on test #5 at any speed, with any amount of voltage, but they never gave me any issues in Windows, or during Prime95 torture testing. I was using the sticks in an ASUS A7N8X-DLX nForce2 board, which may have been the problem.

memtest86 errors are not always caused by the memory itself: chipsets, memory controllers, conflicts with the board and the specific memory ICs on the sticks, or even timing conflicts can cause errors in memtest86 as well.

Do these errors mean that your machine is not stable? Not necessarily - memtest might be finding errors because of software (memtest86 being the software) issues or hardware conflicts that only manifest themselves during memtest86 torture testing.

When running things at stock speeds, errors like this can often be waived with common sense (test the sticks in another motherboard at stock speeds to confirm their integrity). But when overclocking, it is better to be safe than sorry (hardware conflicts with memory do often cause instability, just look at the MSI Neo 2 boards).
 
The easiest way to determine if the chipset is erroring or not is usually in the number of errors. When I was big into overclocking my fsb I pushed to 265 on my NF7 and ran memtest. Even at 1.9V vdd I got around 25k errors.
 
Mate,

Could you include this in your stability tests? It's supposed to run 5C hotter than Prime95.
http://home.comcast.net/~wxdude1/emsite//download/stresscpu.zip

What is StressCPU?
This is a small windows program to torture-test your CPU in order to make sure that you don't have overheating problems. It will only run on SSE-equipped x86 CPUs, and it is executing a special version of the Gromacs innerloops that mixes SSE and normal assembly instructions to heat your CPU as much as possible.

The program was written by Erik Lindahl and I simply compiled it and am making it available to use.
This program actually makes my CPUs run from 4C-6C hotter than simply running Gromacs.
It's a good heat test and it should make any system draw maximum power, which will also test the stability of your power supply.

Let me know if you have any issues.
You can get the program in the EM-DC download area.

Larry
http://www.em-dc.com

Reference thread:
http://www.ocforums.com/showthread.php?t=362053
 
Hello, was just wondering about

Yes, you have it exactly. If you found instability (test errors) in your overclock after running some thorough stability tests, lowering the overclock a little bit is one way to regain stability. Raising the overvolt and improving your cooling will also help, but these are not always possible.

What about finding errors on a stock system? What do you do then? Double check the settings for the 500th time? Or could the default settings be so off that they are causing a game to crash?

The system appears fine on the outside, but crashes a game on a regular basis. The crashes are pretty random, so I don't know what it is. Running memtest on it in the other room.

This might seem like a stupid question...but I have heard about sound cards negatively affecting video cards during certain games? Any way to test their stability separately or together?

Thanks for the help in advance,

_D
 
aptd

What about finding errors on a stock system? What do you do then? Double check the settings for the 500th time? Or could the default settings be so off that they are causing a game to crash?

The system appears fine on the outside, but crashes a game on a regular basis. The crashes are pretty random, so I don't know what it is. Running memtest on it in the other room.

This might seem like a stupid question...but I have heard about sound cards negatively affecting video cards during certain games? Any way to test their stability separately or together?

Thanks for the help in advance,

_D

My philosophy about parts that will not run stable at stock speeds in a new system build - return them ASAP!

If it is an older machine, it might just be really dirty, or need a fresh OS install.

~ Check the operating temperatures if you can do so, to see if they are acceptable.
~ Clean the entire case out (dust and grime buildup can destroy your airflow and temperatures, and can cause instability as a result), and add a cheap fan or two for improved airflow if necessary.
~ Check your machine for viruses and spyware.
~ Check the parts individually in another machine, if it is possible to do so.
~ Bring everything down to "failsafe" speeds (BIOS "failsafe mode", or lowest clock speeds possible), and see if the problems persist.

As for sound cards, you could run a check with the sound card not installed, and then run a check with the sound card installed, and see if there is a difference in the result. If stability is compromised only when the sound card is installed, then you know what is causing the problem :).


Super Nade

Mate,

Could you include this in your stability tests? It's supposed to run 5C hotter than Prime95.

Cool, thanks a lot for the link man! :)

That is a very promising little application; it is well known that Gromacs is very stressful. I am going to add it to the guide in the CPU section - and I will be testing it out myself as soon as I get a chance to do so!
 
Yes, I see.

The problem is which part to send back. Every part is brand new for a brand new system. Guess I will continue running tests and troubleshooting. Thanks for the help. I will not clutter up this great thread; maybe I'll post something over in another section.

Great Thread once again!

_D
 
It runs in console mode only. I'm not sure if the DOS emulator on Windows would cause any issues. I ran it for a couple of minutes and faced no problems. More detailed tests will follow on Monday. I definitely saw a 2-3 C increase over P95 small FFT.
 
aptd

Yes, I see.

The problem is which part to send back. Every part is brand new for a brand new system. Guess I will continue running tests and troubleshooting. Thanks for the help. I will not clutter up this great thread; maybe I'll post something over in another section.

Great Thread once again!

_D

I have sent you a PM :)


Super Nade

It runs in console mode only. I'm not sure if the DOS emulator on Windows would cause any issues. I ran it for a couple of minutes and faced no problems. More detailed tests will follow on Monday. I definitely saw a 2-3 C increase over P95 small FFT.

Please keep me updated, I am expecting a lot from this program! :)
 
heya, just wanted to ask about something ive noticed as i have been doing some stability testing now that my machine is finished.

basically prime95 ran for more than 24 hours at the end of this week and did a damn good job, and my memory is within spec (i think it's UTT BH-5 memory). anyways, test 5 will run forever on memtest but test 7 produces errors immediately, what's that all about?

++
so now i find out the address of the memory error points to the spot 473.7MB and it pops up in test 7 AND 9 AND 11, WHAT is that about!?
 
I would start by increasing your VDIMM overvolt, and seeing if that alleviates the problem. What memory speed, memory timings, and VDIMM voltage are you using when you get these errors? UTT likes a high VDIMM, and may not run stably for you without an overvolt, depending on the timings which you are using.


As for tests #7, #9, and #11:

memtest86

Test 7 [Random number sequence]
This test writes a series of random numbers into memory. By resetting the seed for the random number generator, the same sequence of numbers can be created for a reference. The initial pattern is checked and then complemented and checked again on the next pass. However, unlike the moving inversions test, writing and checking can only be done in the forward direction.

Test 9 [Bit fade test, 90 min, 2 patterns]
The bit fade test initializes all of memory with a pattern and then sleeps for 90 minutes. Then memory is examined to see if any memory bits have changed. All ones and all zero patterns are used. This test takes 3 hours to complete. The Bit Fade test is not included in the normal test sequence and must be run manually via the runtime configuration menu.

Test #11 has been removed from the newest version of memtest86.
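
To make the Test 7 description more concrete, here is a small simulation of the idea on an ordinary Python buffer. It is a sketch of the technique as described above, not memtest86's actual code (which runs against physical memory outside any operating system); the function name, the seed, and the `StuckBitMemory` class are all made up for illustration.

```python
import random
from typing import List


def random_sequence_test(memory: bytearray, seed: int = 1234) -> List[int]:
    """Return the offsets that failed either pass of the check."""
    bad_offsets = []

    # Write a pseudo-random byte to every location, front to back.
    rng = random.Random(seed)
    for offset in range(len(memory)):
        memory[offset] = rng.randrange(256)

    # Reset the seed to regenerate the same sequence as a reference,
    # check each byte, then overwrite it with its complement.
    rng = random.Random(seed)
    for offset in range(len(memory)):
        expected = rng.randrange(256)
        if memory[offset] != expected:
            bad_offsets.append(offset)
        memory[offset] = expected ^ 0xFF

    # Next pass: check the complemented pattern, again front to back.
    rng = random.Random(seed)
    for offset in range(len(memory)):
        if memory[offset] != rng.randrange(256) ^ 0xFF:
            bad_offsets.append(offset)

    return sorted(set(bad_offsets))


class StuckBitMemory(bytearray):
    """Hypothetical faulty memory: bit 0 at one offset can never be set."""

    stuck_at = 10

    def __setitem__(self, index, value):
        if index == self.stuck_at:
            value &= 0xFE  # simulate a cell whose lowest bit is stuck at 0
        super().__setitem__(index, value)


print(random_sequence_test(bytearray(64)))       # healthy buffer
print(random_sequence_test(StuckBitMemory(64)))  # buffer with one stuck bit
```

The healthy buffer reports no bad offsets, while the simulated stuck bit is reported at offset 10. Checking both the pattern and its complement is what guarantees the catch: whichever of the two needs that bit set will fail.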

It is of note that it is a good idea to run through all of the tests looped for a ~24 hour time period - test #5 alone is not thorough enough, although it is a superb quick memory overclock stability check. Test #5 is a great tool when working in some new memory, and quickly scaling it upwards, but by itself it is not enough to declare an overclock "safely stable".

I hope that helps dude :)
 
yea man thanks, is there anything you can tell me about the fact that it returns the error at the same "spot" in the ram? should i ask the memtest people about that? i guess i should get the newest version then. as for vdimm, i'm using the mobo max of 2.85 and the ram is at about 208, which should be alright. i think the max it can reach at 2.85 v before you really need to go higher is like 215-220mhz, so i don't think that's it. what i will do, however, is let the new memtest run all day when i leave for work shortly, and like i said, i wonder if it's something to do with my ram ITSELF that makes the tests return the same spot for errors
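
One way to organise reports like this is to tally which tests fail at which address. The sketch below uses only the errors described above (tests 7, 9 and 11 at 473.7MB); the function name and the tuple layout are assumptions made for illustration, not a memtest86 log format. An address that fails across several different tests is often read as a hint that one spot on the stick is at fault, though nothing in this thread confirms that, and trying the stick in another board is the more direct check.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def addresses_failing_in_multiple_tests(
    errors: Iterable[Tuple[int, float]], min_tests: int = 2
) -> Dict[float, List[int]]:
    """Map each address (in MB) to the tests that failed there.

    errors is an iterable of (test_number, address_mb) pairs; only
    addresses that failed in at least min_tests different tests are kept.
    """
    tests_by_address = defaultdict(set)
    for test_number, address_mb in errors:
        tests_by_address[address_mb].add(test_number)
    return {
        address: sorted(tests)
        for address, tests in tests_by_address.items()
        if len(tests) >= min_tests
    }


# The errors reported above: same spot, three different tests.
observed = [(7, 473.7), (9, 473.7), (11, 473.7)]
print(addresses_failing_in_multiple_tests(observed))
```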
 