SOLVED Kernel-Power Event ID 41 Critical Error

BoundByBlood · May 21, 2011

I don't know if I'm putting this in the right subforum, but here it goes.

Before I begin my system specs:
AMD Phenom II X4 965 BE (AC MX-2 Thermal paste/Scythe Big Shuriken heatsink)
Biostar TA790GX A3+ Motherboard
4GB DDR3 PC3-10600 G.Skill RAM (2GB x 2 sticks)
Diamond Multimedia Radeon HD 5770 1GB
Western Digital SATA II Caviar Black 640GB HDD 32MB Cache (Primary partition)
Western Digital SATA II Caviar Blue 320GB HDD 16MB Cache (Secondary partition/Storage Device)
Windows 7 x64 SP1 (Installed only on Caviar Black 640GB)
Antec Earthwatts 650W PSU

I've been experiencing the most mind boggling and frustrating comp problem and I really need the community's help so if anyone has any insight please let me know. I've tried everything I can think of, I've contacted tech supports for different hardware vendors and have tried all likely solutions I could find on the net unsuccessfully so here I am pleading to you all for help.

So here it goes. I do have to give a bit of a backstory before I can dive into the heart of the matter (I apologize) since the info might be relevant. My previous build was a Phenom II X2 550 BE with a GeForce GTS 250 and my system was running fine. The only relevant upgrades I bought were the 965 and the Radeon HD 5770 (I did buy a keyboard and some speakers, but I know the problem I'm having is not related to peripheral devices) and everything else stayed the same.

Something to note is that ever since I bought my Antec PSU it has made a loud periodic clicking noise, like a sudden "clunk", maybe every thirty minutes or every two hours (not sure on the intervals since I've never timed it), but despite this my system has always worked. I rarely got BSODs, freezes, or crashes and most of the time if I did have an error it was usually something minor such as an app hanging. All in all my system ran fine when I was running the 550.

When I did install the new 965 cpu and the radeon 5770 I also did a clean install of Windows by deleting the partition on the caviar black, creating a new one, and formatting it. I didn't want to take the chance of some obscure driver/software error in my previous installation creating a conflict with the new hardware.

After installing the upgrades the system ran fine for a week or so, but then I started getting random reboots where the system would unexpectedly restart itself. Looking into event viewer the log lists it as a "Source: Kernel-power Event ID 41 Task Category (63)" critical error. I've never heard of this particular issue before I started experiencing it.

Before I continue here's a look at the code:

Log Name: System

Source: Kernel-Power

Logged: 5/20/2011 9:25:17 PM

Event ID: 41

Task Category: (63)

Level: Critical

Keywords: (2)

User: SYSTEM

Computer: Dre-PC

OpCode: Info

The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-Kernel-Power" Guid="{331C3B3A-2005-44C2-AC5E-77220C37D6B4}" />
<EventID>41</EventID>
<Version>2</Version>
<Level>1</Level>
<Task>63</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000002</Keywords>
<TimeCreated SystemTime="2011-05-02T22:32:46.956813000Z" />
<EventRecordID>7346</EventRecordID>
<Correlation />
<Execution ProcessID="4" ThreadID="8" />
<Channel>System</Channel>
<Computer>Dre-PC</Computer>
<Security UserID="S-1-5-18" />
</System>
<EventData>
<Data Name="BugcheckCode">0</Data>
<Data Name="BugcheckParameter1">0x0</Data>
<Data Name="BugcheckParameter2">0x0</Data>
<Data Name="BugcheckParameter3">0x0</Data>
<Data Name="BugcheckParameter4">0x0</Data>
<Data Name="SleepInProgress">false</Data>
<Data Name="PowerButtonTimestamp">0</Data>
</EventData>
</Event>
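For anyone reading an event like the one above, here is a minimal Python sketch of how the EventData fields could be checked. It assumes the three-way reading given in the Microsoft KB article linked in this post (a nonzero BugcheckCode means a stop error was recorded, a nonzero PowerButtonTimestamp means the power button was held, and all zeros means the system hung or lost power). The function name `summarize_event41` and the trimmed `SAMPLE` string are made up for illustration; only the field names and values come from the event itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML (taken from the event above).
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the Event ID 41 record, without the viewer's "- " markers.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID>41</EventID>
    <TimeCreated SystemTime="2011-05-02T22:32:46.956813000Z" />
  </System>
  <EventData>
    <Data Name="BugcheckCode">0</Data>
    <Data Name="SleepInProgress">false</Data>
    <Data Name="PowerButtonTimestamp">0</Data>
  </EventData>
</Event>"""


def summarize_event41(xml_text):
    """Pull out the fields that distinguish the Event ID 41 scenarios."""
    root = ET.fromstring(xml_text)
    data = {
        d.get("Name"): (d.text or "")
        for d in root.findall("e:EventData/e:Data", NS)
    }
    bugcheck = int(data.get("BugcheckCode", "0"))
    power_button = int(data.get("PowerButtonTimestamp", "0"))

    # Assumed interpretation, following the KB article's three scenarios.
    if bugcheck != 0:
        scenario = "stop error recorded"
    elif power_button != 0:
        scenario = "power button held"
    else:
        scenario = "hang or power loss (no bugcheck recorded)"

    return {
        "event_id": int(root.findtext("e:System/e:EventID", namespaces=NS)),
        "bugcheck_code": bugcheck,
        "power_button_timestamp": power_button,
        "sleep_in_progress": data.get("SleepInProgress", "false").lower() == "true",
        "scenario": scenario,
    }


if __name__ == "__main__":
    print(summarize_event41(SAMPLE))
```

With every field at zero, as here, the event only says that Windows did not get the chance to record anything before the restart. It does not by itself point at hardware or software.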

The problem occurs infrequently and it's impossible to say when it might reoccur because sometimes I can run my pc for hours on end for days without issue and then it happens.

I have scoured the net and this seems to be a widespread issue, but it also has to be one of the worst errors to troubleshoot because the symptoms are as diverse as the root issues which might be causing the error. Some people experience BSODs while others have hard hangs or have problems powering their system down properly. Pinpointing the root issue has proven to be damn near impossible for various reasons and it's much like a doctor treating a disease except every patient with the same disease exhibits different symptoms and possibly has a different virus (metaphorically speaking) which is making them sick. How do you treat something like that?

Microsoft has a knowledgebase on this issue, but it isn't helpful at all:
http://support.microsoft.com/kb/2028504

One of the things the MS kb points out as a possible cause is overheating, such as when you install a new cpu, but I've already thought of this. My cpu temps are always under control and I leave the side panel to my pc open so it can operate at the ambient room temperature (I run the AC in the house a lot during the summer). I've checked the heatsink to make sure it is installed properly and it's tightly mounted to the cpu retention bracket on the motherboard.

What happens when I experience one of these kernel-power reboots? I'll be using the computer, usually when the cpu is under light load like surfing the web, and I'll suddenly be staring at a black monitor screen. Before I know it the computer has already started to reboot itself. One of the weird circumstances surrounding the error is that it never happens when the cpu is under heavy load, like when I'm gaming for instance. In fact I can game for hours without problem; these kernel-power reboots happen infrequently when the cpu is under light load and now that I think about it the past four reboots in the last week and a half have happened when I was surfing the web. I know my cpu shouldn't be overheating when I'm surfing the web so really the overheating cause shouldn't apply in my case.

There are two big scenarios which make this so frustrating to troubleshoot:

[1] A minidump is never created when it happens. This limits my ability to analyze the circumstances surrounding the error.

[2] There isn't a lot of info on the net. Sure, the net is flooded with tons of posts with people having similar event id 41 problems for various reasons, but there are very few genuine solutions and I haven't found one that works yet.

I've contacted AMD tech support and they are pretty much clueless. The tech at Antec has been a little more helpful and he thinks it might be the psu due to the periodic clicking noise. The psu should only click during power on/off, but never during normal operation so he suspects a faulty relay switch in the psu. Of course when I was running the 550 with the GeForce 250 the psu clicked back then and I never had one of these kernel-power reboots so why would the psu be an issue now?

A lot of people speculate that if the "bugcheckcode" is blank or has a value of 0 then it's due to faulty hardware. This makes sense since in my previous build I never had these event id 41 errors and this started happening only after I installed the upgrades so logically it's easy to deduce that one of the new components must be the culprit. The 965 was bought brand new, sealed in the box and while I did get the radeon 5770 used off of ebay I haven't had an issue with it yet. I've run the cpu and the gpu both through diagnostics and haven't turned up anything.

Steps I have taken to troubleshoot this issue:

Checked the system for malware - some user on a forum suggested that there are viruses which can cause this error so I've done a full system scan with Microsoft Security Essentials and Trend Micro's housecall...nothing, no infections.

Checked Windows for errors - I've done the usual clean boot as well as safe mode. I've tried the sfc /scannow and windows resource protection didn't find any integrity violations. I've also done chkdsk on both of my hdd's as well as putting both drives through WD's Data Lifeguard Tools and HD Tune. Nothing.

Verified drivers are up to date - according to someone the most common cause for an event id 41 is bad device drivers. All of my drivers are current*.

Diagnosed the RAM - Supposedly the most common faulty hardware for this error is bad ram so I did a Memtest86+ for twelve hours. No errors.

Updated the bios - Someone suggested that an event id 41 is a stepping error in the cpu and recommended upgrading the bios. Flashed the bios yesterday loading optimal defaults and still got a reboot today. Oddly enough the bios version is exactly the same after the flashing and I swear the only thing different is the build date.

Disabled devices in conflict - One user on another forum said he discovered that he had two audio drivers on the same internal high definition audio bus and he resolved his kernel-power issue by disabling one of them. Looking into device manager I found out that I also had two audio drivers on the same bus. One is the Realtek high definition audio (the integrated onboard audio chip) and the other is an AMD HDMI high definition audio which is included with the catalyst driver suite. Well, I don't use HDMI audio so I disabled the AMD audio driver and this seemed to work for a while, but it wasn't long before the reboots started happening again.

*Another user on the net stated that another big cause for this problem is incompatibility of older drivers. Well, my realtek audio driver is the most current one off of windows update, but realtek has released a newer driver on their website. I just installed the latest realtek audio driver tonight and I'm still waiting to see if this works.

Finally, someone else stated they resolved their similar issue by bumping up their MCP, NB, and DRAM voltages slightly. I just went into the bios tonight and bumped all the voltages (Vcore, memory over voltage, nb) up a notch.

If I get another reboot in the next day or so I'm frankly out of ideas. I don't know what to do other than start RMA'ing parts and I was thinking it might be good to start off with the psu and the radeon first if I go down that route.

I'm still waiting to see if the latest realtek driver and the voltage increase will work and check whether or not the system will reboot in the next day or so, but in the meantime does anyone have any insight or ideas that may help me out?

I strongly encourage and appreciate all feedback cuz I'm about to lose my mind over this problem.

Neuromancer · May 22, 2011

Since your PSU is acting up and the error code is well, dated at best, my assumption would be that is the issue.

change PSU and post back

BoundByBlood · May 28, 2011

Well, I never really thought of my PSU as acting up. Ever since I bought it the unit has always made the occasional click or loud clunk sound, but it ran fine and didn't cause any problems so I thought nothing of it. Then I buy some upgrades and install them with a clean install of windows 7 and now I'm getting these spontaneous reboots. One thing that bugs me is if it truly is the PSU then why wasn't I getting these kernel-power reboots before I bought the upgrades?

I'm waiting to hear back from Antec about RMA'ing the psu.

the error code is well, dated at best

Please elaborate on what you mean.

Since the original post I've taken more steps to troubleshoot the issue. I cleaned the windows prefetch folder, I lowered the DRAM timings, and I disabled some unnecessary startup services (such as daemon tools lite loading at windows startup). No change and I've had three spontaneous reboots today, the most I've had in a single day yet.

I'm beginning to worry about the loss/corruption of hdd data if this keeps up. Looking in event viewer all the kernel-power critical errors are followed by another error event id 6008.

Here is the other error (following the most recent event id 41 reboot at 7:42:11 PM tonight)

Log Name: System

Source: EventLog

Event ID: 6008

Level: Error

Logged: 5/28/2011 7:47:45 PM

Task Category: None

Keywords: Classic

Computer: Dre-PC

The previous system shutdown at 7:42:11 PM on 5/28/2011 was unexpected.

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="EventLog" />
<EventID Qualifiers="32768">6008</EventID>
<Level>2</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2011-05-29T00:47:45.000000000Z" />
<EventRecordID>58872</EventRecordID>
<Channel>System</Channel>
<Computer>Dre-PC</Computer>
<Security />
</System>
<EventData>
<Data>7:42:11 PM</Data>
<Data>5/28/2011</Data>
<Data />
<Data />
<Data>547</Data>
<Data />
<Data />
<Binary>DB07050006001C0013002A000B00FF02DB07050000001D0000002A000B00FF02600900003C000000010000006009000000000000B00400000100000000000000</Binary>
</EventData>
</Event>

Does this help any or provide any insight? I can't make heads or tails of this 6008 event.
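One part of the 6008 event that can be read is the start of the Binary field. A minimal Python sketch, assuming the first 32 bytes are two back-to-back Win32 SYSTEMTIME structures (eight little-endian 16-bit values each: year, month, day of week, day, hour, minute, second, milliseconds). That layout is a guess that happens to fit this record, not something stated in the event. The helper name `decode_systemtime` is made up here, and the remaining bytes are left uninterpreted.

```python
import struct
from datetime import datetime

# First 32 bytes (64 hex characters) of the <Binary> field from the 6008 event.
BINARY_HEX = (
    "DB07050006001C0013002A000B00FF02"
    "DB07050000001D0000002A000B00FF02"
)


def decode_systemtime(raw16):
    """Decode 16 bytes laid out like a Win32 SYSTEMTIME into a datetime."""
    year, month, _day_of_week, day, hour, minute, sec, millis = struct.unpack(
        "<8H", raw16
    )
    return datetime(year, month, day, hour, minute, sec, millis * 1000)


raw = bytes.fromhex(BINARY_HEX)
first_stamp = decode_systemtime(raw[0:16])
second_stamp = decode_systemtime(raw[16:32])

print(first_stamp)   # 2011-05-28 19:42:11.767000
print(second_stamp)  # 2011-05-29 00:42:11.767000
```

The first value matches the "7:42:11 PM on 5/28/2011" in the message text, and the second is five hours later, the same offset as between the "Logged" time and the TimeCreated value in the XML, so it looks like the same instant in UTC. In other words the 6008 record carries the last time Windows knew it was alive, while the logged time is when the entry was written after the reboot.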

TempliNocturnus · May 28, 2011

I'm a little reluctant to call this a hardware issue just yet. Could you export and post a copy of your system event logs containing these events? Due to the infrequent nature of these events, I'd suggest running another on Windows install for a length of time in which the event otherwise would have occurred.

Those event codes are very generic and would probably be generated if you cut the power to your system while Windows was running. A number of hardware and power issues could cause that behavior, but I've also seen software/driver issues cause similar issues; however hardware issues are typically very consistent, whereas software issues can often be less so, such as in your case.

If it is a hardware issue, the best thing you can do prior to RMAing anything is to make sure the problem is obvious. If the behavior cannot be easily reproduced in a lab, then expect your equipment to be returned to you as functional. I'd suggest running the OCCT Power Supply test overnight or so; this basically runs the other tests simultaneously to achieve maximum draw from your PSU. The idea is to basically accelerate total failure of a faulty piece of hardware, enabling you to locate and RMA it.

BoundByBlood · May 29, 2011

Thanks for the reply nocturnus.

Could you export and post a copy of your system event logs containing these events?

All the information I've posted here is the entirety of the information in the event logs. Unless of course you want me to gather together the various kernel-power event logs and post them, but I can pretty much guarantee the information in the various system logs is the same. The info in any one specific kernel-power log is the same as the log I posted here, the only thing that is different is the log date. However, if you want me to gather all of them and post them I can do that.

I'd suggest running another on Windows install for a length of time in which the event otherwise would have occurred.

Running another what? The wording is a little confusing, but I'm guessing you want me to replicate the circumstances which produce the spontaneous reboots. Either that or run some kind of diagnostic on my windows install. Could you please clarify? Reproducing the circumstances for the errors is nearly impossible.

Every time I've experienced one I've always been doing something on the net in my web browser. The error has never occurred while I've been gaming, watching movies, making dvd's, or running any other cpu/memory intensive application. Only when I'm surfing the web which is the weird part about it. It could randomly reboot while I'm typing this and I still wouldn't know what would have caused it to happen. I don't know what it is about my web browsing that causes the reboot so I really don't have a testing criteria to replicate the error.

A number of hardware and power issues could cause that behavior, but I've also seen software/driver issues cause similar issues

I've been under the suspicion it is software related. Looking through event viewer I saw a number of event logs a couple hours prior to the kernel-power which looked interesting. They were Application Error Event ID 1000 Task Category 100 where chrome.exe was the faulting application. I began to think to myself as far back as I can remember I've experienced these reboots when I'm surfing the web and I always use Google Chrome, but now I'm seeing application errors with the chrome executable prior to the kernel-power errors. I thought that was it, that this had to be the cause or somehow related to the reboots so I uninstalled google chrome and started using IExplorer again. It worked for a little while and then I got three reboots while using internet explorer today so whatever the problem is it's not specific to google chrome.

All the other software I have on my computer now is the same software I had on it before I upgraded. The only thing I know for certain is prior to upgrading I had never experienced any of these reboots and then I buy a 965 quad along with a radeon 5770 and now I'm getting them. The fact that the reboots didn't start happening until after upgrading along with the fact that the Stop Error BugcheckCode and PowerButtonTimestamp both have a value of "0" is why a lot of people on the net speculate it is hardware related. Still though it's only supposition and not definitive that it is a faulty hardware component.

It's late now, but tomorrow I'm going to run driver verifier to see if that turns anything up. If you can think of other software diagnostics which would be helpful (that are free) please let me know and I'll try them out too.

I'd suggest running the OCCT Power Supply test overnight or so; this basically runs the other tests simultaneously to achieve maximum draw from your PSU. The idea is to basically accelerate total failure of a faulty piece of hardware, enabling you to locate and RMA it.

I'll try all that when I get to the hardware side of troubleshooting; right now I'm still focused on exploring the possibility of it being software related. Just tell me what additional info you want and I'll provide it. As I've stated in a previous post a minidump is never created when it happens because I don't actually get a blue screen just an unexpected reboot.

BoundByBlood · May 29, 2011

I think I found the problem. After browsing through Performance Information and Tools I clicked on the link for view performance details in event log.

I found a number of these system events in event viewer:

Log Name: Microsoft-Windows-Diagnostics-Performance/Operational

Source: Diagnostics-Performance

Event ID: 501

Level: Warning

User: LOCAL SERVICE

OpCode: Video Memory Responsiveness

Logged: 5/29/2011 12:38:21 PM

Task Category: Desktop Window Manager Monitoring

Keywords: Event Log

Computer: Dre-PC

The Desktop Window Manager is experiencing heavy resource contention.
Reason : CPU resources are over-utilized.
Diagnosis : A sharp degradation in Desktop Window Manager responsiveness was observed.

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-Diagnostics-Performance" Guid="{CFC18EC0-96B1-4EBA-961B-622CAEE05B0A}" />
<EventID>501</EventID>
<Version>1</Version>
<Level>3</Level>
<Task>4006</Task>
<Opcode>42</Opcode>
<Keywords>0x8000000000010000</Keywords>
<TimeCreated SystemTime="2011-05-29T17:38:21.324954100Z" />
<EventRecordID>16</EventRecordID>
<Correlation ActivityID="{02EBEC40-F800-0000-2851-0B5E1B1ECC01}" />
<Execution ProcessID="1656" ThreadID="3164" />
<Channel>Microsoft-Windows-Diagnostics-Performance/Operational</Channel>
<Computer>Dre-PC</Computer>
<Security UserID="S-1-5-19" />
</System>
<EventData>
<Data Name="Reason">1</Data>
<Data Name="Diagnosis">2</Data>
</EventData>
</Event>

From what I gather this is a display driver compatibility issue; of course it is unclear if the problem is the driver itself or windows resource management.

I'm going to uninstall the catalyst drivers, reboot into safe mode and run driver sweeper, and reboot normally downloading the ati display driver off of windows update. I'm going to see if the microsoft certified and digitally signed driver available from windows update will make a difference.

UPDATE: Nope, not the solution. After cleaning out the drivers and switching to the one from windows update I'm not getting anymore of these performance system events, but still got another reboot today. Frankly, I'm out of ideas.

BoundByBlood · May 30, 2011

This is pretty much a dead thread since I don't think neuromancer or templinocturnus are coming back.

I ran the OCCT Power Supply Test for 12 hours last night and it didn't fail or crash. So I'm assuming that means the hardware is good or at least stable. If a faulty hardware component was to blame it should have failed during the testing and I'm now not sure what else to do to test the hardware.

On the software side of troubleshooting I'm out of ideas since I've done everything I can think of and the problem still defies me. I thought I found the answer twice yesterday, once with the windows desktop management error and a second time with the microsoft antimalware service needed for security essential real time protection failing, but neither turned out to be the solution.

There's nothing more I can do software wise short of doing another clean install of windows.

TempliNocturnus · May 30, 2011

What I meant by my previous post was to do a clean install on another partition and run it to see whether it reboots; this would completely rule out hardware if it didn't.

An export of your entire event logs would still be nice; I was actually going to try to check for what you found in your previous reply. If you suspect the vidcard driver may be at fault, remove it and use the generic Windows graphics driver for a while. I've never used any driver cleaner programs so I'm not too sure what it did, however, if you go into your device manager and look at the driver tab for a given device, it tells you the vendor. You'd know whether you're using the generic driver if the vendor says Microsoft.

I will typically completely remove a driver from a system by searching the %WinDir%\inf folder for a .inf file containing the Dev_ and Ven_ code for a device. You can find these codes under the Details tab, Hardware Ids, on the device properties. The code for an ATi 5800 series card looks like VEN_1002&DEV_6899. Simply search the inf directory for that code and make sure Windows search is set to search file contents. Once the inf file is found, delete it and uninstall the device. When reinstalled, it will either prompt you for a driver or use a generic driver that matches the class ID.
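The search step described above can also be scripted instead of relying on Windows search. A minimal Python sketch, assuming the inf folder is readable and that .inf files are either plain ANSI text or UTF-16 with a byte order mark; `find_inf_files` is a made-up helper name, and the script only lists matching files. It does not delete anything.

```python
import os


def find_inf_files(inf_dir, hardware_id):
    """Return paths of .inf files in inf_dir whose contents mention hardware_id."""
    needle = hardware_id.lower()
    matches = []
    for name in sorted(os.listdir(inf_dir)):
        if not name.lower().endswith(".inf"):
            continue
        path = os.path.join(inf_dir, name)
        try:
            with open(path, "rb") as handle:
                raw = handle.read()
        except OSError:
            continue  # skip files that cannot be read

        # Assumption: a BOM means UTF-16, anything else is single-byte text.
        if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
            text = raw.decode("utf-16", errors="ignore")
        else:
            text = raw.decode("latin-1")

        if needle in text.lower():
            matches.append(path)
    return matches


if __name__ == "__main__":
    inf_dir = os.path.join(os.environ.get("WINDIR", r"C:\Windows"), "inf")
    if os.path.isdir(inf_dir):
        # Hardware ID taken from the post above; substitute your own device's ID.
        for path in find_inf_files(inf_dir, "VEN_1002&DEV_6899"):
            print(path)
```

The match is case-insensitive because hardware IDs are not always written in the same case inside the files.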

BoundByBlood · May 30, 2011

Hi again TempliNocturnus.

An export of all system events? I can do that. I'll export them, zip 'em up, and attach them to a post.

Well, I may have found the solution this time. I'm not going to be premature and say I figured it out since it's still in a test phase (if I go a whole week without a reboot I'll know it's solved), but I'm about 85%+ sure this time.

Looking into event viewer I noticed a pattern. Just a few minutes prior to the latest kernel-power error I spotted a Microsoft Antimalware Event ID 3002 error. I use microsoft security essential and the dependent service needed for real time protection was crashing and only a few minutes prior to the kernel-power error. I scrolled through event viewer and sure enough just before every kernel-power error I have I saw the same antimalware 3002 error logged minutes prior to the log time for the unexpected reboots. I tried uninstalling and reinstalling security essential, but I still got a reboot. So, I uninstalled security essentials and I haven't seen an antimalware 3002 error or a reboot since.
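The pattern hunt described above (which events keep showing up shortly before each unexpected reboot) can be sketched in Python once the log is exported. This is only an illustration: the function name `events_before_reboots`, the 10 minute window, and every timestamp in the sample list except the 7:42:11 PM shutdown on 5/28/2011 are made up, and the source names are just labels.

```python
from collections import Counter
from datetime import datetime, timedelta


def events_before_reboots(events, reboot_times, window_minutes=10):
    """Count, per (source, event_id), how many reboots it preceded within the window."""
    window = timedelta(minutes=window_minutes)
    counts = Counter()
    for reboot in reboot_times:
        seen = set()  # count each signature at most once per reboot
        for when, source, event_id in events:
            if reboot - window <= when < reboot:
                seen.add((source, event_id))
        counts.update(seen)
    return counts


# Hypothetical sample data; only the second reboot time comes from the thread.
reboots = [
    datetime(2011, 5, 28, 14, 5, 0),
    datetime(2011, 5, 28, 19, 42, 11),
]
events = [
    (datetime(2011, 5, 28, 12, 0, 0), "Application Error", 1000),
    (datetime(2011, 5, 28, 14, 1, 30), "Microsoft Antimalware", 3002),
    (datetime(2011, 5, 28, 19, 38, 2), "Microsoft Antimalware", 3002),
    (datetime(2011, 5, 28, 19, 40, 0), "Service Control Manager", 7036),
]

if __name__ == "__main__":
    for (source, event_id), hits in events_before_reboots(events, reboots).most_common():
        print(f"{source} {event_id}: preceded {hits} of {len(reboots)} reboots")
```

A signature that precedes every reboot is a lead worth testing, which is what the uninstall experiment does; it could still be a symptom of the same underlying fault rather than the cause.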

Now I'm in a test phase to evaluate my hypothesis of whether or not the real time protection in security essentials and the windows service needed to run it are to blame. I'll know by next weekend if it is indeed the solution. Going eight hours strong without a spontaneous reboot so we will see.

If you still want the system events I can export them as .evtx, .xml, .txt, or .csv so tell me your preference and I'll get them. I don't recommend the text file because it comes out jumbled and is hard to read, but whatever is easiest for you I'll get.

Adak · May 30, 2011

Your approach to troubleshooting this is rather -- troubling.

PSU's do not, at any interval, make a loud clicking noise. They do not make ANY clicking noise, and have no parts that should make one.

But you've zipped right past the obvious, and into all manner of everything else that could be the problem, ignoring the big elephant in your living room of troubles.

CPU's are run on clock sweeps. If the power isn't being fed into the chips that produce those sweeps, in just the right way, those sweeps will become less, and it may appear that your cpu is being "over utilized". Make sense?

If MS Security software was making systems reboot, I can assure you, you'd hear it screamed from 100 Million rooftops, nightly. I've used it, and it never does. What it DOES do, is put a higher demand on your system, and on your PSU. < hint, hint > You should use it, and other programs that will put a heavy demand on your system, as you troubleshoot this problem.

PSU's that are acting strange, are nothing to be toyed with. Not only can they quit, they can also take out your mobo and cpu, in one fell swoop, and also become a real fire hazard, shooting out sparks and or a very smelly and unhealthy smoke. I'm not saying the PSU is the problem here, only further testing will find it, but your PSU should be replaced, no matter what else you find.

My advice is to quit trying to work with the software and start isolating the hardware. Software can't always find hardware problems, because hardware runs at a lower level. Your problem is hardware related imo, and not software. Software problems typically cause the application running to act up or quit, but they won't cause the system to reboot, 98% of the time, because the applications all have their own isolated workspace, and limited access to the OS and other internals.

So a test would be to first, find the exact circumstance where you can initiate a failure on the system, relatively quickly. Yes, you WANT it to be set up so it fails, and quickly.

Now repeat that test, with one factor changed - perhaps start with another PSU that has equal or more total power, and is known to be good.

Does it fail now, under the same test?

Continue to isolate one factor at a time, until you've isolated the problem, OR found none of the hardware factors, were at fault. The problems that are hardest to deal with, are the intermittent problems that can't easily be initiated.

But get rid of the clicking PSU. That is a hazard. One thing humans are not real good at, is limiting their risk. We "like" a stock pick, we "love" our homes, cars, computers, and even our power supplies. We "know" they would never lose a lot of money, or hurt us. Too bad these inanimate objects, don't feel that way. IMO, it's time to limit your risk with that PSU, regardless of anything else.

BoundByBlood · May 30, 2011

Your approach to troubleshooting this is rather -- troubling.

I'm sorry you feel that way.

PSU's do not, at any interval, make a loud clicking noise.

They do when they are cut on and off. Depending on the cooling system you may not notice it, but if you listen closely you might hear it sometime. Every psu I have owned always has made the click when I power it up or after it cuts off when I shut the system down.

They do not make ANY clicking noise, and have no parts that should make one.

They have a relay switch for powering the fan on or off inside the unit. The clicking, or maybe more of a "clunk" as the antec tech described it, is the sound of the switch flipping over and cutting on the fan. It's normal to hear the noise when the system first powers up or when it is shut down, but it should never be heard during normal operation. Some are softer than others.

But you've zipped right past the obvious, and into all manner of everything else that could be the problem, ignoring the big elephant in your living room of troubles.

I didn't zip past it, a failing psu was one of my first suspicions. In fact Antec's tech support were the first people I contacted. They are currently working on a RMA. Or they were, the status of my case with them has changed and is pending at the moment.

CPU's are run on clock sweeps. If the power isn't being fed into the chips that produce those sweeps, in just the right way, those sweeps will become less, and it may appear that your cpu is being "over utilized". Make sense?

Indeed it does. That issue about the desktop window manager being under heavy contention and the cpu being over utilized has been resolved. I looked at the dates and remembered that was during the time I was running driver verifier on the system checking the integrity of all installed drivers. Since then I did a clean install of the latest catalyst drivers and that particular error hasn't populated event viewer again. Problem solved. That's one bird down.

PSU's that are acting strange, are nothing to be toyed with. Not only can they quit, they can also take out your mobo and cpu, in one fell swoop

I've had that happen to me before with my previous psu before the Antec I have now. My previous psu was a Rosewill...I know, I know, don't give me flak for it please.

If MS Security software was making systems reboot, I can assure you, you'd hear it screamed from 100 Million rooftops, nightly.

Kernel-power event id 41 errors were screamed from lots of rooftops. The issue is about a year old now and isn't as apparent as when windows 7 was released, but a lot of people on the net never did find a solution to their spontaneous reboots. It's quite possible that MS security software or other antivirus apps were to blame but the people never figured it out because they might have had limited computer knowledge or they just didn't spend enough time troubleshooting it.

My advice is to quit trying to work with the software and start isolating the hardware.

I've been troubleshooting both simultaneously. I've been running Memtest to verify the ram, chkdsk /r as well as putting the hdd through HDTune and have been doing a few other diagnostics on the hardware. I've run the OCCT Power Supply Test for two 12 hour intervals now putting maximum draw on the psu trying to initiate an accelerated failure of a faulty component as TempliNocturnus suggested if indeed there is a bad component in the system. Even with max draw on the psu and the cpu and gpu stress tested to the max the comp did not crash, fail, or reboot. Even after running OCCT for a twelve hour interval I was able to game for a few hours and watch a movie as well as burn a dvd without a problem.

So you see how peculiar this problem is right?

Ever since I uninstalled the security essentials my computer has been running right the way it should be. I've been using it all day today and haven't had a single reboot. I had three spontaneous reboots yesterday when it was installed. So what does that tell you logically? I'm going on 14 hours strong without a reboot since uninstalling security essentials.

It does get a little weirder because as you stated MS security software puts a higher demand on the psu. If we assume the psu is to blame then why wouldn't it fail under maximum draw during the OCCT test?

Your problem is hardware related imo

I'm not so convinced of that anymore. I'm not saying it's not possible, but I believe it to be a software issue and not having a reboot since removing security essentials is reaffirming my suspicions.

Software problems typically cause the application running to act up or quit, but they won't cause the system to reboot, 98% of the time, because the applications all have their own isolated workspace, and limited access to the OS and other internals.

It wasn't the security essentials app itself that was the problem but the windows service, microsoft antimalware, which is needed for real time protection. If you looked through my event viewer you would see that minutes prior to every kernel-power error logged there was an error for ms antimalware crapping out. Every time the system reboots an EventLog Event ID 6008 is created and it states the time of the spontaneous reboot (the time logged in the kernel-power error is not the actual time of the reboot but the time the system logs the error). Well looking at the times for the 6008's the ms antimalware crapping out was always minutes prior and I mean mere minutes. That's heavy laden evidence in itself.
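For anyone who wants to repeat that kind of check, here is a minimal Python sketch of the correlation described above: for each unexpected-shutdown time reported by an EventLog 6008 entry, it lists the errors logged in the few minutes before. It assumes the events have already been copied out of Event Viewer into (timestamp, source, event ID) tuples. The function name, the 5-minute window, and all of the sample data are made up for illustration and are not taken from the actual logs.

```python
from datetime import datetime, timedelta


def errors_before_reboots(events, reboot_times, window_minutes=5):
    """Map each reboot time to the error events logged shortly before it.

    events: list of (timestamp, source, event_id) tuples for logged errors.
    reboot_times: times of the unexpected shutdowns, as stated in the
        EventLog 6008 messages (not the time the entry itself was logged).
    """
    window = timedelta(minutes=window_minutes)
    result = {}
    for reboot in reboot_times:
        # Keep only the errors that fall inside the window before this reboot.
        result[reboot] = [
            (ts, source, event_id)
            for ts, source, event_id in events
            if reboot - window <= ts < reboot
        ]
    return result


# Hypothetical sample data, for illustration only.
events = [
    (datetime(2011, 5, 29, 14, 2), "Microsoft Antimalware", 3002),
    (datetime(2011, 5, 29, 9, 30), "Disk", 7),
    (datetime(2011, 5, 30, 20, 41), "Microsoft Antimalware", 3002),
]
reboot_times = [datetime(2011, 5, 29, 14, 5), datetime(2011, 5, 30, 20, 44)]

for reboot, preceding in errors_before_reboots(events, reboot_times).items():
    names = [source for _, source, _ in preceding]
    print(reboot, names)
```

If the same source shows up before every reboot, that is a trend worth chasing. If the lists are empty or different each time, the correlation is probably a coincidence.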

So a test would be to first, find the exact circumstance where you can initiate a failure on the system, relatively quickly. Yes, you WANT it to be set up so it fails, and quickly.

Now repeat that test, with one factor changed - perhaps start with another PSU that has equal or more total power, and is known to be good.

Does it fail now, under the same test?

Continue to isolate one factor at a time, until you've isolated the problem, or found that none of the hardware factors were at fault.

A good suggestion, the only problem is I don't have the money or expendable parts for that kind of hardware testing. I do have the money to replace a component if it is indeed faulty, but I don't have spare parts that I can swap out for that kind of testing such as another processor, a different vid card, different ram modules, etc. I've gotta work with what I have which is all I have.

But get rid of the clicking PSU.

Working on it. The tech at Antec, his name is Peter, isn't so convinced anymore that the psu is to blame. He elevated my case to his superior and I'm waiting to hear back.

TempliNocturnus · May 31, 2011

The reason I wanted to see your event logs was to see whether I could spot anything similar to what you spotted: any trending events that always occur prior to the restarts. I also use event logs to establish a baseline time in which this event would typically occur. You seem to have done both: you found the MS antimalware service error prior to the reboots and you've determined that your machine would typically have restarted within a 1 week time span.
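As a rough illustration of what a baseline means here, the Python sketch below takes a list of unexpected-reboot times and works out the gaps between them. A test period longer than the longest gap gives some confidence that a fix actually worked. The timestamps and the `reboot_gaps_hours` name are hypothetical, not values from the OP's logs.

```python
from datetime import datetime


def reboot_gaps_hours(reboot_times):
    """Return the gaps, in hours, between consecutive reboots."""
    ordered = sorted(reboot_times)
    return [
        (later - earlier).total_seconds() / 3600
        for earlier, later in zip(ordered, ordered[1:])
    ]


# Hypothetical reboot times, for illustration only.
reboots = [
    datetime(2011, 5, 24, 21, 0),
    datetime(2011, 5, 26, 9, 0),
    datetime(2011, 5, 30, 21, 0),
]

gaps = reboot_gaps_hours(reboots)
# The longest gap is the minimum time to wait before calling it fixed.
print("longest gap in hours:", max(gaps))
```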

I typically like to rule out hardware as an issue right off the bat, but that's often a bit difficult on an online forum where I cannot touch the hardware. The quickest and most effective way of determining whether a hardware fault exists when the OS is still bootable is by running a different OS install and observing whether the problem still occurs; which I suggested previously. At this point however, I would re-suggest that course of action only if the problem continues; it seems the OP may be on to something.

OP, if you'd still like to upload your logs, I'd prefer them in evtx format; that way I can open it up with my event viewer. I may request your logs again after your week test period, so you may hold off posting them until you receive an unexpected reboot; if it happens.

BoundByBlood · May 31, 2011

Well, it appears I was wrong. I got another unexpected reboot tonight and looking into event viewer there isn't a trend of an app or service failing prior to the unexpected restart. At this point I've done all I can do software wise other than do a clean install of windows.

I could create a small 10 gig partition and do another install of windows like you suggested nocturnus. If it is software related then the problem has to be with the os installation and in terms of software troubleshooting the only option I have left is to do what you suggested. Of course if I did do that I wouldn't know how to replicate the problem because I just don't know what causes it. I've run system file checker on the os, but it didn't detect any problems.

If it's hardware then it has to be the psu. The processor was brand new in the box and the vid card was refurbished. I believe both of them to be good since they're not giving me problems otherwise and they passed OCCT's stress testing. Of course if it is the psu I just can't understand why OCCT wouldn't accelerate a failure of the unit. It should have caused a reboot if nothing else if the unit is faulty, but it didn't. I'm just mind boggled.

Here is a copy of my system events nocturnus. If you have any insight or see something I might have missed please let me know. Oh, and you can ignore all those errors about the bad hard drive sector. My secondary hard drive was failing and I've RMA'd it. Primary drive is still good, but won't be for much longer if I don't solve the reboot mystery.

I appreciate your help thus far and any future feedback you may continue to provide.

WeThePpl · May 31, 2011

Hey! I had this event error ID. Try running this command through the cmd - sfc /scannow.

You may need to post the results, but it will save a text file when it is complete. I had this same error and my PSU died a week later http://www.overclockers.com/forums/showthread.php?t=677755 that is my post from it. I do not think your problem is the same as mine, though. Have you tried another PSU? Also, does the system overheat? If it does then the computer may bluescreen and log it as event id 41 since the computer was not properly shut off. If you force shut down your computer via the power button, it will also present you with this same error. It relates to many different symptoms from AVG antivirus issues to a corrupt file system. Let me know how that scan runs!

BoundByBlood · May 31, 2011

Hey!

Hi.

Try running this command through the cmd - sfc /scannow.

I have done all kinds of diagnostics under the assumption it is software related. I have run the system file checker multiple times.

Windows resource protection did not detect any integrity violations.

You may need to post the results, but it will save a text file when it is complete.

It will only create a txt file if it finds a problem. In my case since it doesn't come across any issues a txt file is not generated.
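For reference, as far as I know sfc on Windows 7 writes its details into the CBS log (%windir%\Logs\CBS\CBS.log) and tags its own entries with [SR]; the text file people usually post is just those lines filtered out. A minimal Python sketch of that filter is below. The sample log lines are invented to show the shape of the data, and the default log path in the comment is an assumption about a standard install.

```python
def sfc_entries(log_lines):
    """Keep only the lines tagged [SR], the tag sfc uses in CBS.log."""
    return [line.rstrip("\n") for line in log_lines if "[SR]" in line]


# Hypothetical log excerpt, for illustration only.
sample = [
    "2011-05-31 22:10:01, Info  CBS  Starting TrustedInstaller initialization.\n",
    "2011-05-31 22:10:05, Info  CSI  00000006 [SR] Verifying 100 components\n",
    "2011-05-31 22:10:09, Info  CSI  00000008 [SR] Verify complete\n",
]

for line in sfc_entries(sample):
    print(line)

# On a real machine (path assumed to be the Windows 7 default):
# with open(r"C:\Windows\Logs\CBS\CBS.log", errors="replace") as f:
#     entries = sfc_entries(f)
```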

I had this same error and my PSU died a week later http://www.overclockers.com/forums/s...d.php?t=677755 that is my post from it.

Holy beejesus those pics are huge.

I do not think your problem is the same as mine, though.

I think it's similar as in my psu is failing just like how yours went bad.

Have you tried another PSU?

Don't have a spare one to test. I just got my RMA approved by Antec so I'm sending it off tomorrow.

Also, does the system overheat?

No. I've run OCCT's (Overclock Checking Tool) power supply test for two 12-hour intervals. The app runs three tests simultaneously; it stress tests the cpu and gpu concurrently while pulling the maximum power draw from the psu. During the testing the cpu was between 67-69°C and if overheating was the problem then it would have forced a crash or another spontaneous reboot with the cpu being stressed at those temps for that long, but it didn't happen. The comp went through both OCCT tests without failing. I don't know much about what is going on with the problem, but I do know it's not because of overheating.

If it does then the computer may bluescreen and log it as event id 41 since the computer was not properly shut off.

I'm not getting blue screens when it restarts hence no minidump files either.

If you force shut down your computer via the power button, it will also present you with this same error.

I'm not forcing a hard shutdown via the power button. It just spontaneously reboots on its own. It goes to black screen one second and the system is restarting the next.

It relates to many different symptoms from AVG antivirus issues to a corrupt file system.

I'm beginning to think a corrupt file system. At this point a bad windows installation is the only thing that makes the slightest bit of sense (aside from a bad piece of hardware possibly).

I remembered that a while back I had a startup problem with windows and boot manager was missing. I had to boot from the windows cd and in a windows recovery environment I had to fix the master boot record as well as rebuild the bcd. Now that I think about it whatever caused that problem may have corrupted the file system, particularly the windows kernel mode. Yet, if that were the case sfc should have found a problem. Last night I booted from the windows cd to go into the recovery environment so I could attempt some repairs and it listed my installation as "Windows 7 x64 (recovered)". The "recovered" in parentheses might be a clue to a bad file system.

BoundByBlood · Jun 1, 2011

I still haven't figured it out, but I'm making progress.

Under normal circumstances I only get a reboot after extended use of the computer. For instance if I leave the house I will set up the computer to download while I'm gone. When I come home, after the computer has been on for a few hours, I'll game for a bit and watch a movie without issue. If after watching a movie I want to get on the net this is when I experience the problem. This is the weird aspect of the problem, the computer reboots when I'm in a web browser. I can download for hours, game, and watch movies, but when I get into a web browser after extended use is when the spontaneous and ungraceful reboot happens.

An AMD tech contacted me with a couple of suggestions. First, he said I should try turning Cool 'N Quiet off in the bios. I did disable Cool 'N Quiet and it increased the frequency of the reboots. While it was off my computer was essentially unusable. I would log into windows, load up Internet Explorer and after browsing the net for 5-10 minutes (sometimes it would only be two minutes) I would get a reboot. I got caught up in a cycle of reboots. I took out the 5770 card and hooked up the onboard video to see if that would make a difference. Still got the reboots.

I re-enabled Cool 'N Quiet and the constant rebooting stopped. I'm still getting the reboots after extended use under the normal circumstances listed above, but not nearly as often as when Cool 'N Quiet is off. I'm still trying to figure out what this means and whether it relates to hardware or software, but this is the biggest clue I've had yet. The problem is a layer cake and I'm still not in the middle yet, but this means something. I know windows 7 has a cool 'n quiet driver so maybe it is corrupt. I think this is pointing towards a bad file system and I'm still leaning towards a corrupt windows installation.

I read somewhere on the net that in order to get good cool 'n quiet functionality you first have to set your computer to the power saver plan with the minimum processor state at 5% and maximum processor state at 100% under the processor power management subheading in advanced power options. I've done that. Then they said to enable C1E support and DRAM power down in the bios. I turned on C1E, but apparently my bios doesn't have the dram power down option or I don't know where the option for it is in the bios.

We will see what the AMD tech has to say about why disabling Cool 'N Quiet accelerates the spontaneous reboots when he responds.

TempliNocturnus · Jun 2, 2011

Weird, but at least you now have a way to accelerate the occurrence of the issue. At this point, I would still be focused on completely ruling out software/driver issues. If anything, you can try booting up to a live linux CD and browsing the web with the Cool n' Quiet settings disabled in BIOS and see whether results are consistent with the behavior you're experiencing in Windows.

If the results are consistent, then you definitely have a hardware problem. It's looking like a faulty sensor or corrupt code in your BIOS could possibly be to blame. To rule this out, you could try flashing your BIOS (this is a bit risky if your system decides to reboot out of the blue). It's also feasible that the PSU could be to blame, however I'd lean more towards the motherboard, as I've actually seen a very similar issue on newer Dells.

If I suspect that either the PSU or motherboard could be the problem, I will typically replace the PSU first, simply because it takes less effort. If you do not have access to a bench PSU, you could go to a local store and buy one and return it after you're done. Once accomplished, you will have either ruled out the PSU completely or determined that you need a new PSU.

BoundByBlood · Jun 20, 2011

I honestly forgot about this thread.

In the 19 days since my last post I haven't had a single reboot. All I did was adjust the power plan and enable C1E support for Cool 'N Quiet in bios. For all intents and purposes it worked and my comp has been running fine. I can even leave it running all night to download and when I check it in the morning it's fine and still no reboot. The only thing I can figure is something was screwy with Cool 'N Quiet and the aforementioned fixes resolved the problem.

I appreciate the feedback and the help. I'm labeling this problem as solved.

WiZiD · Nov 30, 2012

After searching the internet for the same problem I found a setting in my BIOS called DRAM Over-Current Protection: "Sets a trip threshold to shut down the VRM if excessive current is drawn from the VDIMM VRM. Set to disabled if overclocking with 4GB DIMMs at speeds over DDR3-2000."
Setting this to disabled resolved my problem. Just throwing this out there in case someone picks up this old thread. My board is a Crosshair V Formula.

redduc900 · Nov 30, 2012

Good to know... thanks.
