Hi!
I have an interesting PC to sort out here. I can't overclock till I get the stock system stable.
I'd like to go a step at a time, on the basis that too much information is more helpful than too little, and because it's quite an unusual case.
Full spec is in the sig.
The problem
I'm not a gamer, but I'm possibly one of the more intensive PC users not into games. I'm hammering a load of subsystems on a daily basis without much let-up: memory, multitasking, net connections, I/O, you name it, maybe half a dozen virtual machines, loads of disk access, and a bunch of simulations. The system is up 24/7 and is capable of handling the load, and its stability and performance are immensely valuable to me. So like most overclockers, I'm looking for genuine hard-core rock-solid stability and system build quality from the get-go (if it's not stable under heavy load then it's simply not stable, full stop) so I can let it run for as long as I want without random BSOD crashes and restarts.
Unfortunately, while the rig can easily handle the workload, it's not stable, and I am trying to work out why not. I probably need at the least a very good PSU, and possibly even a move to water cooling. But before I spend any cash I'd like to ask advice on the weird things I'm seeing, and for any suggestions on how others would fix the rig that don't involve replacing items that don't need it.
I'd also like to know what kind of OC I might expect.
Steps taken so far
I started with a Q6600, an Asus P5K series board, Crucial Ballistix 1066 (2x2GB), and (significantly) a budget power supply that seemed good enough at the time, a Trust 570W.
After a month or 2 at stock settings, I started to notice the system wasn't as stable as it should be. The 1st candidate was memory, and RAMTEST confirmed memory issues. I swapped out some sticks of Crucial Ballistix, then some more, then yet more... before a tech guy commented that they had more returns of that memory than of any other. At that point I replaced the Ballistix with Corsair Dominator (2.10V). But weirdly, the problems persisted: the system stayed unstable, and the new memory had a high failure rate on RAMTEST too (one stick was even DOA). So I swapped the motherboard as well, to my present Abit IP35 Pro XE. Finally RAMTEST seemed happy, and XP32 seemed happy, and I proceeded to enjoy the system.
But the new system was still noticeably less stable than the old. I hit it with Prime95 and found it would drop out with fatal rounding errors on random cores, after a random time. It didn't seem to do this on small FFT sizes, which don't use much RAM. If I understand these results correctly, they suggest a memory/mobo problem again.
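To put rough numbers on the small-FFT observation, here is a minimal sketch. It assumes each FFT element is one 8-byte double and ignores Prime95's lookup tables, so treat the figures as ballpark only. The 4 MB figure is the Q6600's L2 cache per core pair (8 MB total); the function names are hypothetical.

```python
# Rough working-set estimate for a Prime95-style FFT torture test.
# Assumption: one 8-byte double per FFT element; lookup tables are ignored.
BYTES_PER_ELEMENT = 8
L2_CACHE_BYTES = 4 * 1024 * 1024  # Q6600: 8 MB L2 total, 4 MB shared per core pair


def fft_working_set_bytes(fft_size_k: int) -> int:
    """Approximate data size in bytes for an FFT of fft_size_k * 1024 elements."""
    return fft_size_k * 1024 * BYTES_PER_ELEMENT


def fits_in_cache(fft_size_k: int, cache_bytes: int = L2_CACHE_BYTES) -> bool:
    """True if the estimated working set fits within the given cache size."""
    return fft_working_set_bytes(fft_size_k) <= cache_bytes


if __name__ == "__main__":
    for size_k in (8, 64, 512, 4096):
        size_kb = fft_working_set_bytes(size_k) // 1024
        where = "mostly cache" if fits_in_cache(size_k) else "spills to RAM"
        print(f"{size_k}K FFT: ~{size_kb} KB working set, {where}")
```

If the estimate is roughly right, the small sizes mostly exercise the CPU and its cache, while the large sizes also exercise the RAM and the memory controller on the board. Two workers sharing one core pair's cache would halve the room available to each.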
By this time I'd also switched from XP32 to XP64 to get the use of the extra memory, and from SATA IDE emulation to native AHCI in the BIOS and drivers to squeeze an extra drop of performance from the hard drive systems. I don't know if either of those has a reputation for stability issues.
I'd also upgraded the stock Intel fan to an Arctic Cooling Freezer 7 Pro with Antec Formula 5 silver TIM, and added a fan to the Corsair RAM (which gets in the way of the CPU fan). I also unplugged most of the HDs to reduce the strain on the PSU and to test whether the problem was in the PSU. I also have a fan blowing on the HDs to keep them under 45 degC.
I ran Prime95 again with most HDs unplugged, the 2 extra fans, and a domestic high-volume room-cooling fan up close to the motherboard for air flow. The system was definitely more stable; for example, Prime95 ran for 9 hours without an error. But the OS did still crash with stop 0x24, suggesting an NTFS fault. Yet when I booted the same PC from a spare system on another HD and tested the XP64 drive using chkdsk and SMART, they reported no disk fault! Finally, when I allowed the PC 2 hours "off", it was then able to boot up fine from the XP64 disk, suggesting the problem was somehow linked to a possible system-wide cooling issue rather than a RAM issue?? You can see why I started to get perplexed!
Sadly I don't have alternate components for my PC to "swap in" to isolate any fault.
What might be up
That's where I'm up to. My suspicions are one or more of the following:
- PSU - The PSU, though rated at 570W, may be unreliable. If a rail (e.g. to the RAM) is weak or prone to instability under high demand, or the PSU is at its limits and the 570W is marketing puff, then that could account for all kinds of stability issues.
- Cooling - Extra cooling seemed to help, but my old system (P4 based) didn't need more than stock cooling.
- Problem with the motherboard or RAM - Either faulty as supplied and I've been unlucky, or became faulty (e.g. due to heat or a PSU issue). But how likely is it that this could affect multiple sticks and 2 boards?
- AHCI firmware or driver issues - Although this wouldn't be responsible for everything, I am wondering whether the AHCI route is as stable as the IDE emulation route. I think I've noticed more problems in XP64 + AHCI mode than in XP32 + IDE mode.
The problem is there might be a multi-part issue. Those are notoriously hard to diagnose.
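One way I could untangle a multi-part issue is to change one thing per test run. Below is a minimal sketch of that idea; the factor names and values are just my own labels for the settings mentioned above, and `one_factor_plan` is a hypothetical helper, not part of any tool.

```python
# Sketch: build a one-factor-at-a-time test plan.
# The factor names/values are illustrative labels taken from this post.
from typing import Dict, List

# Assumed baseline: the most stable configuration seen so far.
BASELINE: Dict[str, str] = {
    "storage_mode": "AHCI",
    "drives_attached": "minimal",
    "extra_fans": "on",
}

# One alternative setting per factor.
ALTERNATIVES: Dict[str, str] = {
    "storage_mode": "IDE emulation",
    "drives_attached": "all",
    "extra_fans": "off",
}


def one_factor_plan(baseline: Dict[str, str],
                    alternatives: Dict[str, str]) -> List[Dict[str, str]]:
    """Return the baseline run plus one run per factor, each changing only that factor."""
    plan = [dict(baseline)]
    for factor, alternative in alternatives.items():
        run = dict(baseline)
        run[factor] = alternative
        plan.append(run)
    return plan


if __name__ == "__main__":
    for number, run in enumerate(one_factor_plan(BASELINE, ALTERNATIVES)):
        print(number, run)
```

Each run would then get the same Prime95 session, so a change in stability can be pinned on the single factor that differs from the baseline.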
What I'm thinking of doing
I'm seriously thinking about replacing the PSU with a high quality one that I won't have to replace for years, and which will have enough headroom for an eventual shift to i7: say around 800+ W, 5+ year warranty, modular (for neatness but also in case new types of connection come out) and a reputation for rock solid stability.
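For a feel of the headroom involved, here is a rough power-budget sketch. Only the CPU figure is a real number (Intel's 105 W TDP for the B3-stepping Q6600; the G0 stepping is 95 W). Every other wattage is a placeholder I made up for illustration, not a measurement of my rig, and `psu_load_fraction` is a hypothetical helper.

```python
# Sketch: estimated load as a fraction of a PSU's rated output.
from typing import Mapping


def psu_load_fraction(component_watts: Mapping[str, float],
                      psu_rated_watts: float) -> float:
    """Return total estimated draw divided by the PSU's rated wattage."""
    return sum(component_watts.values()) / psu_rated_watts


# Placeholder figures, except the CPU TDP noted above.
ESTIMATED_DRAW = {
    "cpu_q6600": 105.0,        # Intel TDP, B3 stepping
    "motherboard_ram": 50.0,   # placeholder
    "graphics": 75.0,          # placeholder
    "hard_drives": 6 * 10.0,   # placeholder: 6 drives at ~10 W each
    "fans_misc": 20.0,         # placeholder
}

if __name__ == "__main__":
    for rated in (570.0, 800.0):
        share = psu_load_fraction(ESTIMATED_DRAW, rated)
        print(f"{rated:.0f} W unit: about {share:.0%} of rated output")
```

The catch is that this only compares against the label. If a budget unit can't really deliver its rated figure, or one rail sags under load, the true margin could be much smaller than the sum suggests.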
I'm also considering a move to water cooling, rather than air cooling. The CPU can get to 57 - 60 degC under load with air cooling, and I'm not convinced that's low enough to be stable. For any more cooling than this I'd have to move to water. It would also remove a big heat-producing block of copper/aluminium from the middle of the case, and bypass the problem where the CPU fan intake is half blocked by the memory fan.
If those don't resolve the issue then, at a pinch, I'd move to a new processor/mobo/memory completely (probably Q9550/P5Q3/DDR3-1600) and sell off this one. I'd like to avoid that if possible: the PSU, and possibly the cooling if I have to do it, will be a big enough "hit" to finance.
Advice? Hardware purchases? Anything else?
So... a long explanation for a complex diagnosis. Any thoughts on how to approach this PC or interpret the findings would be appreciated! The symptoms somehow don't really make sense to me. Although as a last resort completely replacing everything would fix it, I'd like to avoid that.
If someone can suggest what the symptoms might mean, or what information is needed to better diagnose it, or even a suitable PSU and water cooling setup (plus any other hardware changes to consider going for!)... then please do.