Quit 101 - Fatal error: Box exploding.
davidnjina
01-17-07, 10:37 AM
I've been getting this message on one of my machines from time to time. I know it says to check the stability of the machine, but this one is my work computer and I don't think it's OC'ed.
Checking with Dell, this machine has "PROCESSOR, 80547, PENTIUM 4 PRESCOTT DT, 650, SKT-T" clocked at 3.4 GHz (according to system properties).
The following is from FAHlog-Prev.txt.
[19:38:46] Completed 20000000 out of 20000000 steps (100)
[19:39:49] Folding@home Core Shutdown: FINISHED_UNIT
[19:39:52] CoreStatus = 64 (100)
[19:39:52] Sending work to server
The above means that it finished one WU, right?
The machine finished two WUs, then logged eight "Quit 101 - Fatal error: Box exploding." messages.
Then it finished a WU, and logged another "Quit 101 - Fatal error: Box exploding." message.
Do you guys think this is a stability problem? Do you have any idea what I can try? I don't know what to do because this one is a Dell machine and I didn't do anything to cause instability in the system.
Is there any free and reliable software to monitor and log CPU temperature?
Thanks.
Yes David, core status 64(100) does mean the WU has finished properly.
All the temp monitoring programs read data from the mobo itself. Your mobo may or may not have the sensors in the same place, with the same values, as other motherboards.
Although Speedfan is mentioned, I would call Dell and see what they recommend for this, first. Nobody knows a Dell better than Dell - hopefully.
Have you checked the insides for a buildup of dust, hair and dirt? Some of my SMP WUs do give me this message. It means the atoms have moved further apart in the simulation than is possible in real life. I haven't seen more than 3 in a row (retries of the same WU), however.
Good luck, David.
Adak
leelegend
01-17-07, 03:56 PM
I have only seen that on an SMP unit, and that was ages ago, when the client on the Mac had only just been released.
lee
Shelnutt2
01-17-07, 04:57 PM
rofl!
Box exploding... not really the best error message, is it? :p
davidnjina
01-17-07, 05:08 PM
:mad: This happened again just now at 99% completed. Doh!
[22:55:51] Writing local files
[22:55:51] Completed 49500 out of 50000 steps (99)
[22:59:56] Quit 101 - Fatal error: Box exploding.
[22:59:56]
[22:59:56] Simulation instability has been encountered. The run has entered a
[22:59:56] state from which no further progress can be made.
[22:59:56] This may be the correct result of the simulation, however if you
I am going to rest this machine for a bit as I need to run some Access queries on over 1 million records.
jws2346
01-17-07, 05:57 PM
Eekers, I've never seen that message "Box exploding" (knock on wood) PLEASE, PLEASE post the remedy. :confused:
Leviathan41
01-17-07, 06:10 PM
I am having the same problem with my P4C 2.8GHz machine, but only on a certain work unit. Every time I receive p3302_ribcompHT, I get the Exploding Box error. However, I just finished folding a patty melt without any error. I even ran Prime95 for 40 hours while it was folding the patty melt, so I don't think the computer is unstable. Nothing is overclocked and I even increased the voltage slightly over the default to see if that helped.
It's a strange error, but it only happens on that work unit (which is fahcore_79). I just got a fahcore_80 WU (p2904_BBA5) so I am curious to see if it completes correctly like the patty melt (fahcore_78).
Here is the error message that I am getting:
[18:45:28] Quit 101 - Fatal error: Box exploding.
[18:45:28]
[18:45:28] Simulation instability has been encountered. The run has entered a
[18:45:28] state from which no further progress can be made.
[18:45:28] This may be the correct result of the simulation, however if you
[18:45:28] often see other project units terminating early like this
[18:45:28] too, you may wish to check the stability of your computer (issues
[18:45:28] such as high temperature, overclocking, etc.).
[18:45:28] Going to send back what have done.
[18:45:28] logfile size: 8455
[18:45:28] - Writing 9008 bytes of core data to disk...
[18:45:28] ... Done.
[18:45:28]
[18:45:28] Folding@home Core Shutdown: EARLY_UNIT_END
[18:45:30] CoreStatus = 72 (114)
If the rig is stable, it's a WU fault. I've had a few, and I see them in groups, on the same WU project.
The problem lies with the WU itself, and there's nothing you can do about it IF your rig is stable.
Adak
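For anyone checking their own logs, here is a minimal Python sketch for reading the CoreStatus lines quoted in this thread. It assumes the first number is hexadecimal and the bracketed one is decimal, which holds for both lines shown here (0x64 = 100, 0x72 = 114). The function names and the lookup table are made up for illustration, and the table only covers the two codes that appear in this thread, labelled with the "Core Shutdown" text logged next to them.

```python
import re

# Matches lines such as "[19:39:52] CoreStatus = 64 (100)".
CORE_STATUS = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")

# Only the two codes seen in this thread; labels come from the
# "Folding@home Core Shutdown" line logged just before each one.
KNOWN_STATUS = {100: "FINISHED_UNIT", 114: "EARLY_UNIT_END"}


def parse_core_status(line):
    """Return the status as a decimal int, or None if the line has no CoreStatus."""
    match = CORE_STATUS.search(line)
    if match is None:
        return None
    hex_value = int(match.group(1), 16)
    dec_value = int(match.group(2))
    # Assumption: both numbers encode the same value (hex, then decimal).
    if hex_value != dec_value:
        raise ValueError(f"unexpected CoreStatus format: {line!r}")
    return dec_value


def describe(line):
    """Map a CoreStatus line to a label, or 'unknown' for codes not listed above."""
    status = parse_core_status(line)
    if status is None:
        return None
    return KNOWN_STATUS.get(status, "unknown")
```

Calling describe() on each line of FAHlog.txt and skipping the None results gives a quick list of how each unit ended.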
I had a couple of these "box exploding" messages yesterday on my work machine (also a Dell with a Prescott), but they were from similar work units so I figured the problem was with the work units, not the machine. I'll check it again when I get to work this morning.
sean uk
01-18-07, 12:32 PM
hi
I've had that message once, when I first started with the SMP core. I can't remember which unit it was, but it re-ran the same WU and finished OK.
Leviathan41
01-18-07, 06:57 PM
I just got a fahcore_80 WU (p2904_BBA5) so I am curious to see if it completes correctly like the patty melt (fahcore_78).
This WU completed ok so it looks like the problem is with that specific WU (p3302_ribcompHT).
@davidnjina: do you know what WU you are having the problem on? Is it only one WU or is it more than one?
EDIT: I just found this thread: http://forum.folding-community.org/viewtopic.php?t=17474&postdays=0&postorder=asc&start=0 and it sounds like there is a problem with this WU and similar WUs (p3301). Hopefully they will stop sending these out soon. I just got another that exploded after 10 frames.
I would report the proteins with their logs over at -> http://forum.folding-community.org/viewforum.php?selected_id=f10
If you post the log, use the ..... tags around it.
As a minimum, post up your project run/clone/gen info and the mods can look up the WUs to see if others who get reassigned those WUs completed them successfully or not.
Heat buildup from blocked airflow, dust etc., as suggested, is the first thing to check when you get a string of WUs that error.
When you see:
[19:39:52] CoreStatus = 64 (100)
[19:39:52] Sending work to server
it is a good thing since the error gets reported back.
When you get errors like a file I/O error, the WU gets discarded and nothing gets reported. These types are almost always caused by your rig's instability.
It is good to report these ones as well if you are unsure, with the log, to help Stanford diagnose issues.
Some of the WUs error out 'naturally'. The WU is a simulation, and if the wrong starting point is given for the simulation, this can happen, but mostly in beta or to a lesser extent when running beta. Shortcuts are also taken in the simulations to make them computationally feasible; Stanford usually aims to keep the 'natural' failure rate at a few % of the total.
Sometimes malformed projects do get out thru beta and -advmethods to the 'public', so it is important to report them at the FAC forum so Stanford can stop the project and correct it as early as possible.
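Before reporting, it can help to see whether the early endings cluster on one protein or are spread across several. Below is a rough Python sketch that tallies outcomes per protein from a log. It assumes each unit logs a "Protein:" line before its "Folding@home Core Shutdown:" line, as in the excerpts posted in this thread. The function names and the min_failures threshold are invented for illustration and are not anything the client or Stanford defines.

```python
import re
from collections import defaultdict

# Line formats taken from the log excerpts quoted in this thread.
PROTEIN = re.compile(r"Protein: (\S+)")
SHUTDOWN = re.compile(r"Folding@home Core Shutdown: (\w+)")


def tally_outcomes(log_text):
    """Count shutdown reasons per protein, e.g. {'p3302_ribcompHT': {'EARLY_UNIT_END': 2}}."""
    counts = defaultdict(lambda: defaultdict(int))
    current = "unknown"  # used if a shutdown appears before any Protein line
    for line in log_text.splitlines():
        match = PROTEIN.search(line)
        if match:
            current = match.group(1)
            continue
        match = SHUTDOWN.search(line)
        if match:
            counts[current][match.group(1)] += 1
    return {protein: dict(outcomes) for protein, outcomes in counts.items()}


def suspect_proteins(counts, min_failures=2):
    """Proteins that ended early at least min_failures times and never finished."""
    return sorted(
        protein
        for protein, outcomes in counts.items()
        if outcomes.get("EARLY_UNIT_END", 0) >= min_failures
        and outcomes.get("FINISHED_UNIT", 0) == 0
    )
```

If only one protein shows up in suspect_proteins() while others finish, that fits the "bad WU" pattern described above. Early endings spread across many different proteins point more toward heat or stability. Either way this is only a hint, and the log itself is what the mods will want to see.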
davidnjina
01-19-07, 02:47 AM
Yep! That particular WU gave me trouble eight times.
[04:23:17] Protein: p3302_ribcompHT