
UG! Farm is failing.


don256us

Uber Folding Senior
Joined
Jul 17, 2003
It may not be as bad as I think but I am losing equipment again. I will not be replacing a bunch of it if it is down for good, at least not right away. I have one machine that wants to shut down all of the time. I have one machine that has four cards but is only earning the same ppd as a single card. Fans have been failing and I have been taking them apart and putting working pieces together giving me a Frankenstein's monster look. (Not a Frankenstein look as Frankenstein was the Dr.)

I will work on getting the working pieces back up and running and salvage any points that I can. After we pay for this year's house projects, I will start to look at replacement parts to bring my earning power back up. My replacement farm will most likely consist of new parts instead of old Bitcoin mining stuff. I will most certainly be looking for stuff that uses less power too. We'll see.

Most certainly, when the time comes, I will check in with you, my brethren, to see what I should be looking at for a low cost high point solution to my current woes.

Until then...

:salute:

Edit: Another issue that prevented me from troubleshooting the farm is that my network had a persistent failure. After a few hours of troubleshooting, I think I found the issue. I have a cable modem, wireless gigabit router, and rack mount 24-port gigabit switch. After monkeying around with that, I think that it's my other router, the one I configured to be an access point. Tonight I should be able to put some of my farm back online when I get home.
 
Sounds like a lot of work. Please let us know what parts you need. If I have what you need I'd be willing to send some stuff your way.
 
Thanks guys. What I'm looking to do short term is to consolidate the working parts that I have and reduce the number of hosts that are pulling power from the wall. This will work out as I need to reduce production for the summer anyway. Once winter starts to return, I'll look at my options and see where I go.
 
I hit a bit of a snag but I should be able to work my way out of it in the next few days. I moved a card to my workstation rig. However, since the machine was only producing 130k ppd with four 7970's, I decided to reload the machine from scratch. I got the base OS installed last night and started the update process. I also moved another card to my Windows 10 machine but that machine wants to stop folding all of the time. I will move all three of those cards to the Windows 7 machine that I vacated for the summer and see where that puts me.

Unfortunately, due to an abundance of rain and an overabundance of pine needles in my gutters, I may not get to the machines tonight. I should be able to boost production by tomorrow night at the latest.
 
So if any of you are keeping score, my ppd dropped by about 1 million down to 500k. I had gotten my two main rigs back up and running and was feeling pretty good about it. The next day, I looked at my EOC stats and I saw that I had turned in 80 work units (wu) on one update and over 40 on the next. That isn't right because I usually only submit something like 7 wu per update. When I got back to my farm, I saw that one machine had gotten its original problem back. I shut it down in disgust. My #1 main folding rig, with 5 working cards showed all 5 slots as "failed". I rebooted and everything was looking fine. I sat and watched this machine for a bit and then started to see the issue at work. All five slots start working, reach 0%, then download and do it again. Each slot in its own time does this over and over. Hence 80 wu on a single update. I was so infuriated that I shut the machine down and turned off the power bars that supply electricity.

I will get back to it once I get my summer household projects under control. New windows this year, getting the gable ends of the house painted to match the new roof we did last year, and a power transfer box for my portable generator. I think that I will post a small work log on that one. It will allow me to power the furnace, fridge, TV, basic network and a few lights with my portable generator. Or, conversely, the electric water heater if I turn everything else off. The water heater takes like 4500 watts and the generator is rated to 5500 so.....
 
I brought up my workstation rig with 4 cards running. It is doing fine except...

Four cards are only earning ~149k ppd. Not each but TOTAL! That ain't right. That's why I rebuilt the rig from scratch. Now I have to figure out why the points are so low. It's not the OS or video driver as those have been installed from scratch. I think that I will try to move my hardware around to another machine and see if I can maintain my points for the summer.
 
Well, perhaps with the release of Nvidia Pascal it's time to move from the 280X cards to Pascal or Maxwell.
With this release there should be a few 980 or 980 Ti cards hitting the used market.
Four 980s in a single board should be easier to maintain, use less power, make less heat and keep you at 1.5 million ppd.
The Maxwell release being not so long ago should mean that these cards are newer and might have more points left in them than the older AMD cards.
With such a productive farm as yours, water cooling would be in the back of my mind. I should think that a single loop with a 4x180 or 9x120 radiator might add to the cards' lives and, overall, reduce your time investment on the maintenance side of things.
 
Agreed. You should still get a decent price for those used Tahiti cards since they will STILL be the double precision champs. They work very well for the BOINC crowd.
 
I pulled four cards from my "Workstation" rig. I was only getting ~140k ppd from it. I put in one card and was getting 3k. I moved it to another slot and got 140k on its own. I put in a second card and only got around 170k before I stopped watching it and went to bed. I will look at it soon tonight and try other things. With two 7970/280x I should be closer to 280k +. Little by little I will figure out what works and what to toss.
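For anyone who wants to check the arithmetic, here is a rough sketch of the shortfall, assuming each healthy 7970/280X is worth about 140k ppd (the single-card figure from the good slot above). The function name and the 140k baseline are just illustrative assumptions, not anything FAH reports.

```python
# Rough shortfall check. Assumption: a healthy 7970/280X earns ~140k ppd,
# taken from the single-card result in the good slot.
ASSUMED_PPD_PER_CARD = 140_000


def ppd_shortfall(observed_ppd, cards, per_card_ppd=ASSUMED_PPD_PER_CARD):
    """Return (expected ppd, missing ppd, fraction of expected actually earned)."""
    expected = cards * per_card_ppd
    missing = expected - observed_ppd
    fraction = observed_ppd / expected
    return expected, missing, fraction


# Observed figures from the posts above: (label, card count, observed ppd).
observations = [
    ("one card, good slot", 1, 140_000),
    ("two cards", 2, 170_000),
    ("four cards", 4, 140_000),
]

for label, cards, observed in observations:
    expected, missing, fraction = ppd_shortfall(observed, cards)
    print(f"{label}: expected {expected}, missing {missing}, earning {fraction:.0%} of expected")
```

If the fraction falls well below 100% as soon as a particular card goes in, that card (or its slot) is the likely suspect.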
 
You should use my handy-dandy program! :thup:

Yep. Will do.

Unfortunately, when I got home, the machine had shut down completely. After a round of loud cursing, I took out the second card and installed a different second card. Points were hovering at 150k for two cards. Very much NOT what I wanted. We'll look again tonight.
 
Late reply. Just read all the issues. Something clearly wrong with that PC. Longshot, but maybe a PSU issue? Could be driver, but that seems unlikely if it hasn't changed. Are you reinstalling FAH after swapping cards? I always seem to have issues when I change cards and just reinstall each time.

And I hear you with other projects. I have a similar generator project and also just had to clean gutters!
 
The PSU is a monster 1600 watt that I got at a song of a price from another team member. More testing and here is what I think I have going on.

One card is seen by Windows as Standard VGA. FAH sees it as the 7970 that it is yet with that card installed, all cards suffer a great point loss. That card is done.
A second 7970 started acting up. Once it gets to 100% GPU load, the fan kicks up to 100% (loud) and the temp rises to over 100°C. It gets so hot that it shuts down the PC. That card is done.

As of last night, I have two Frankenstein cards running and I seem to be getting points. I will pull the coolers from those other two cards and see if I can fix other cards with poor/no cooling. I will run each and every one of my 11 cards (the count at the height of my production) into the grave. Then I will begin to replace.
 
That almost sounds like a driver issue (or a failing GPU). I wish I had more experience...or any experience...with GPU folding to lend a hand.

Just curious (and folks let's not go crazy here) but what are you using for power filtration? Any chance this could be a dirty power problem? In the past I experienced major system issues (on multiple systems) and was pounding my head against the wall trying to figure out what the problem was. I tried everything; RAM, CPU, mobo, case, PSU, HD, wiping/re-installing the OS, etc. Nothing worked, and the issue would change from one troubleshooting session to the next. I thought I was starting to lose it.

What was the cause? Bad power. Even though my voltmeter was reading a "normal" voltage (within range) when I tested it, it wasn't able to show me the millisecond duration dips/spikes that were coming from the power source. Once I corrected that issue, all of my (computer related, haha) problems disappeared into thin air. It was educational for me, given my background in IT, that something so 'simple' could cause such a headache with the equipment.

The downside? Now I'm a little OCD when it comes to power filtration and run professional grade gear, because I'm forever paranoid that my power source isn't clean. :(

OK -- I'm hopping off the soapbox now. I hope you're able to cycle through the cards and get shiny new replacements. Er...but only after I get mine, first. :p
 
Interesting. I would have thought a decent PSU would regulate any kind of spikes. Maybe the tolerances are tighter for CPU voltages as compared to GPU voltages. Still, good info to know.
 
For me, the issue is most certainly not power related. I have two rack mounted surge protectors each with a dedicated 20 service from the distribution panel. The power supply for this rig is a well known name brand but I can't recall as I write this.

No, for me the issue is old, used, put-away-wet hardware that has just reached the end of its life span and is showing it in many ways.
 
Got it! :) Worth asking, but it definitely sounds like you've got that part covered.

I have two rack mounted surge protectors each with a dedicated 20 service from the distribution panel.
Sweet!!! Got any pics you can share? Pretty please?? :p I'm always curious how folks have their farms set up.
 
I do but my work network won't let me do it. I have work after work so hopefully I can upload them tonight.
 