
Learning How to Set Up F@H


MattNo5ss

5up3r m0d3r4t0r
Joined
Aug 11, 2008
Quick question: do I need to set up any flag for the GTX580? I noticed it said to use "-forcegpu nvidia_fermi" for the GTX400 series, but nothing is said about GTX500 cards, although they are Fermi as well. It doesn't look like my GPU is being used much; it'll jump to ~60% load for a couple of minutes, then drop back to 0% for a few minutes.

Above is my previous question from the New Member thread. Basically, what I'm doing is trying to learn how to set the clients up so I can set up other systems for more continuous F@H use than my 24/7 system.

If you are receiving fahcore15 then I would say you are good to go. I do not use that flag on my 460 or 465 cards and all is good. However, if this is an SLI setup you might add this flag to the 2nd and subsequent cores (cards) just for good measure. Are you folding SMP in VMware Player? VMware's priority is a little too high by default for SMP folding alongside a GPU and must be adjusted. I was able to achieve about 90% utilization on my GPU when folding alongside SMP with a few tweaks.

I'm getting both FahCore_a3 and FahCore_15. I'm just using one GPU and I'm doing this inside of Win7 Pro x64.

GPU utilization should be in the high 90% range continuously, if running correctly. Task manager should show FAHCore_15 process using 3% or thereabouts of the cpu. Perhaps you should start a new thread and post up the log, client.cfg, and give us a bit more info like client version, OS, driver version (not all work). We'll get you going.

The GPU definitely isn't at 90% continuously. I have FahCore_15 using 1% of the CPU.

Client versions are 6.41 for the GPU and 6.34 for the CPU.

Here's a screenshot for more info:

Capture.PNG

Here's the client.cfg contents, I kept most things at default:
Code:
[settings]
username=MattNo5ss
team=32
passkey=c0365e22cf9fd78cbacd6ec56c89df11
asknet=no
machineid=2
bigpackets=normal

[http]
active=no
host=localhost
port=8080
usereg=no

[core]
priority=96
cpuusage=5
disableassembly=no
checkpoint=15
ignoredeadlines=no
nocpulock=0
addr=

[power]
battery=no

[clienttype]
memory=1024
type=0
 
You have to have 2 client.cfg files. I presume you posted the gpu client.cfg. Change cpuusage to 100% on it and your gpu should be more fully utilized. I can't see what WU the gpu is working on in the info you posted (the log will show that).
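For anyone who would rather script that cpuusage change than hand-edit the file, here is a minimal Python sketch. It assumes client.cfg parses as a plain INI file (the one posted above looks that way) and that the client tolerates the file being rewritten, which I have not verified. The `set_cpuusage` helper is a made-up name, and the sample text is just a trimmed copy of the [core] section. Stop the client and keep a backup of the real file before trying anything like this.

```python
import configparser
import io


def set_cpuusage(cfg_text: str, percent: int) -> str:
    """Return cfg_text with the [core] cpuusage value replaced by percent."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # interpolation=None so a literal '%' in any value is left alone
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    parser["core"]["cpuusage"] = str(percent)
    out = io.StringIO()
    # keep the key=value style (no spaces) that the posted file uses
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Trimmed copy of the [core] section posted above
sample = """[core]
priority=96
cpuusage=5
checkpoint=15
"""

print(set_cpuusage(sample, 100))
```

The other keys are written back untouched, so the same helper can be pointed at the text of either client's config.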
 
Yep, it's the GPU config.

I assumed the GPU didn't need the CPU to run F@H, I thought it was all done on the GPU. If I set the CPU usage to 100% for the GPU client, would that take away from the CPU client?

Here's the GPU log, sorry I cut out the work unit in the cmd prompt in the screenie...

Code:
# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.41r2

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\F@HGPU
Executable: C:\F@HGPU\[email protected]


[21:49:42] - Ask before connecting: No
[21:49:42] - User name: MattNo5ss (Team 32)
[21:49:42] - User ID: 1FF5C1550109CC0D
[21:49:42] - Machine ID: 2
[21:49:42] 
[21:49:42] Gpu type=3 species=20.
[21:49:43] Loaded queue successfully.
[21:49:43] 
[21:49:43] + Processing work unit
[21:49:43] Core required: FahCore_15.exe
[21:49:43] Core found.
[21:49:43] Working on queue slot 01 [October 13 21:49:43 UTC]
[21:49:43] + Working ...
[21:49:43] 
[21:49:43] *------------------------------*
[21:49:43] Folding@Home GPU Core
[21:49:43] Version                2.20 (Tue Aug 2 12:06:37 PDT 2011)
[21:49:43] Build host             SimbiosNvdWin7
[21:49:43] Board Type             NVIDIA/CUDA
[21:49:43] Core                   15
[21:49:43] 
[21:49:43] Window's signal control handler registered.
[21:49:43] Preparing to commence simulation
[21:49:43] - Looking at optimizations...
[21:49:43] - Files status OK
[21:49:43] sizeof(CORE_PACKET_HDR) = 512 file=<>
[21:49:43] - Expanded 43492 -> 167707 (decompressed 385.6 percent)
[21:49:43] Called DecompressByteArray: compressed_data_size=43492 data_size=167707, decompressed_data_size=167707 diff=0
[21:49:43] - Digital signature verified
[21:49:43] 
[21:49:43] Project: 6802 (Run 14, Clone 39, Gen 426)
[21:49:43] 
[21:49:44] Assembly optimizations on if available.
[21:49:44] Entering M.D.
[21:49:46] Will resume from checkpoint file work/wudata_01.ckp
[21:49:46] Tpr hash work/wudata_01.tpr:  2680623063 356309363 1808779400 3507494402 3810102207
[21:49:46] calling fah_main gpuDeviceId=0
[21:49:46] Working on ALZHEIMER'S DISEASE AMYLOID
[21:49:46] Client config found, loading data.
[21:49:46] Starting GUI Server
[21:51:12] Resuming from checkpoint
[21:51:12] fcCheckPointResume: retreived and current tpr file hash:
[21:51:12]    0   2680623063   2680623063
[21:51:12]    1    356309363    356309363
[21:51:12]    2   1808779400   1808779400
[21:51:12]    3   3507494402   3507494402
[21:51:12]    4   3810102207   3810102207
[21:51:12] fcCheckPointResume: file hashes same.
[21:51:12] fcCheckPointResume: state restored.
[21:51:12] fcCheckPointResume: name work/wudata_01.log Verified work/wudata_01.log
[21:51:12] fcCheckPointResume: name work/wudata_01.trr Verified work/wudata_01.trr
[21:51:12] fcCheckPointResume: name work/wudata_01.xtc Verified work/wudata_01.xtc
[21:51:12] fcCheckPointResume: name work/wudata_01.edr Verified work/wudata_01.edr
[21:51:12] fcCheckPointResume: state restored 2
[21:51:12] Resumed from checkpoint
[21:51:12] Setting checkpoint frequency: 500000
[21:51:12] Completed  43500001 out of 50000000 steps (87%).
[21:58:49] Completed  44000000 out of 50000000 steps (88%).
[22:06:25] Completed  44500000 out of 50000000 steps (89%).
[22:13:58] Completed  45000000 out of 50000000 steps (90%).
[22:21:33] Completed  45500000 out of 50000000 steps (91%).
[22:29:10] Completed  46000000 out of 50000000 steps (92%).
[22:36:46] Completed  46500000 out of 50000000 steps (93%).
[22:44:21] Completed  47000000 out of 50000000 steps (94%).
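
The "Completed ... steps" lines in a log like this are enough to work out time per percent. Below is a rough Python sketch of that arithmetic. The regular expression is an assumption based only on the lines shown here, and `seconds_per_percent` is a made-up helper name, not part of any F@H tool.

```python
import re
from datetime import datetime

# Pattern guessed from the progress lines in the log above
LINE = re.compile(r"\[(\d\d:\d\d:\d\d)\] Completed\s+(\d+) out of (\d+) steps")


def seconds_per_percent(log_text):
    """Return seconds spent per 1% of the work unit between progress lines."""
    points = []
    for m in LINE.finditer(log_text):
        stamp = datetime.strptime(m.group(1), "%H:%M:%S")
        points.append((stamp, int(m.group(2)), int(m.group(3))))
    rates = []
    for (t0, s0, total), (t1, s1, _) in zip(points, points[1:]):
        elapsed = (t1 - t0).total_seconds()
        if elapsed < 0:  # timestamps wrapped past midnight
            elapsed += 24 * 60 * 60
        pct = (s1 - s0) * 100 / total
        if pct > 0:
            rates.append(elapsed / pct)
    return rates


# Last three progress lines from the log above
sample = """[22:29:10] Completed  46000000 out of 50000000 steps (92%).
[22:36:46] Completed  46500000 out of 50000000 steps (93%).
[22:44:21] Completed  47000000 out of 50000000 steps (94%).
"""

rates = seconds_per_percent(sample)
print(rates)                    # [456.0, 455.0]
print(sum(rates) / len(rates))  # 455.5
```

That works out to roughly 7.6 minutes per percent for this stretch of the log.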
 
The power management mode didn't change anything.

Setting the CPU usage to 100 in the GPU config file got the GPU going at 99%, but will the CPU client suffer from having it set this way?
 
The GPU will slow down the smp client. You have to decide if the net gain is worth the power expenditure. Most of us don't use the GPU on OC'd 2600Ks.
 


The cpu is needed to communicate with the GPU. The GPU does the computation but the CPU runs the client and everything else. As ChasR said running the GPU and the SMP client usually doesn't yield great enough ppd improvements to justify it. However it doesn't hurt to try for one or two work units. You can set the -oneunit flag to finish the current workunit and then stop.
 
I don't know anything about the ppd yet, but I'm trying to tweak the CPU usage. With 100% on both the CPU and GPU clients, it takes 11 mins per % (1100 mins per unit) on the CPU and 1 min per % (100 mins per unit) on the GPU. I changed it to 50% on the GPU and lost 45s per % on the GPU but gained 2 mins per % on the CPU.
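
To put those timings side by side, here is a small Python sketch that turns minutes per % into work units per day. It assumes "lost 45s" means the GPU went from 1 min to 1 min 45 s per %, and "gained 2 mins" means the CPU went from 11 to 9 mins per %. It says nothing about points, since the value of each unit isn't known here.

```python
MINUTES_PER_DAY = 24 * 60


def units_per_day(minutes_per_percent):
    """Work units finished per day at a steady pace (100 percent per unit)."""
    return MINUTES_PER_DAY / (minutes_per_percent * 100)


# Minutes per % read off the timings above (see assumptions in the lead-in)
settings = {
    "GPU client cpuusage=100": {"cpu": 11.0, "gpu": 1.0},
    "GPU client cpuusage=50": {"cpu": 9.0, "gpu": 1.75},
}

for label, t in settings.items():
    print(f"{label}: CPU {units_per_day(t['cpu']):.2f} units/day, "
          f"GPU {units_per_day(t['gpu']):.2f} units/day")

before = settings["GPU client cpuusage=100"]
after = settings["GPU client cpuusage=50"]
for client in ("cpu", "gpu"):
    change = units_per_day(after[client]) / units_per_day(before[client]) - 1
    print(f"{client.upper()} throughput change: {change:+.1%}")
```

Under those assumptions the GPU loses a bigger share of its throughput than the CPU gains, but which setting wins overall depends on the points per unit for each client.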

What's odd, though, is that when I set the CPU usage to 50% on the GPU client, my GPU usage now shows as 0% in Afterburner/GPUz/etc. It's not really at 0%, because it's completing the unit in about the same time (45s slower per %) and it's running at 53C as opposed to the 34C idle, so it's doing something but the monitoring tools don't seem to be recognizing it.

So, you guys think I should run the GPUs on a P4 631 and/or E8700 system instead of the SB? Also, does PCIe bandwidth make a huge difference in folding like in benchmarks? I ask because I have a PCIe x16 @ x4 on the Commando, but if it limits the cards (8800 series or the GTX580s) in folding, then I won't bother.
 
The ppd/KWh on the gpus will be much higher on lesser rigs than the 2600K.

PCIe bandwidth shouldn't affect gpu production, at least on the g92 and gt200 based cards. I'm not sure about the Fermis.

IIRC on core_15 with 6.41 the cpu % setting actually throttles the gpu down to 0% to arrive at the % utilization you set. This feature was added to prevent overheating on some of the Fermis. Afterburner's refresh rate may be in sync with the throttling, which would cause it to show 0% until it got out of sync.
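
A toy Python simulation of that sync effect, for illustration only. The one-second throttle period, the 50% duty cycle, and the idea that the monitor takes instantaneous samples are all assumptions I picked to make the point; none of them come from the client or from Afterburner.

```python
def gpu_busy(t_ms, period_ms, duty):
    """True while the (hypothetical) throttle lets the GPU work."""
    return (t_ms % period_ms) < duty * period_ms


def sampled_utilization(sample_interval_ms, offset_ms, period_ms, duty, n=1000):
    """Percent of n instantaneous samples that catch the GPU busy."""
    hits = sum(
        gpu_busy(offset_ms + i * sample_interval_ms, period_ms, duty)
        for i in range(n)
    )
    return 100 * hits / n


period_ms, duty = 1000, 0.5  # assumed: busy for the first half of every second

# Monitor refresh matches the throttle period and lands in the idle half
print(sampled_utilization(1000, 750, period_ms, duty))  # 0.0
# Same refresh rate, but landing in the busy half
print(sampled_utilization(1000, 250, period_ms, duty))  # 100.0
# Sampling every millisecond recovers the real duty cycle
print(sampled_utilization(1, 0, period_ms, duty))       # 50.0
```

The GPU does the same amount of work in all three cases; only the sampling changes what gets reported.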
 
Your decision as to whether to fold the gpu on the 2600k is more difficult because the 580 is obviously a powerful gpu folder. Most of my folding gpus are installed on older dual core machines in which the cpu just cares and feeds for the gpu. However, when folding a 2600k every minute counts and the lost points add up quickly. Ideally, I would gather that a 2600k would best be allowed to run on its own without a graphics card chewing up watts while doing basically nothing. But in the real world most 2600k builds are going to have a decent gpu. Keep good notes and make your own call.

As I understand from my reading at the folding@home forum, the sliders are an attempt to avoid overheating gpus, as ChasR mentioned. I don't use the sliders at all; rather, I install the "core-priority" flag on individual slots. Most often I set gpu to "low" and smp to "idle" when I'm concerned about starving my gpu folder.
 

Alright, I'll try the 8800GT and 8800GTS (G92) in the P4/Commando. I also have 3 G80 cards, GTX and two GTS, to figure out what to do with. Maybe put two in the E8700/REX system. All those cards are volt-modded as well so I should be able to get some decent clocks out of them for 24/7 usage, although I haven't tried any of them for 24/7 yet...lol. I'm assuming frequency of the shaders (CUDA cores) is all that matters for folding, is that the right assumption?


I think I want to keep the GTX580s on my daily system, for now at least, and I'll just fold those while I sleep. After I saw how fast my GTX580 completed a 50mil unit, I set my other up as well, and they completed 8 units last night as I slept. So, for now, I'll be doing GPUs and not worrying about the CPU.

I've been editing the cfg to set different CPU usages, I'm guessing you have sliders on a GUI somewhere? I'll try messing with the priorities too.

What are the best notes to take? I'm just keeping up with time to complete a unit.
 
I don't think Fermi cards have independent shader clocks like the older cards do, but for the others, shader clock matters far more than core or memory.

Install HFM to monitor all those instances.
 
Yeah, the core and shader frequencies are linked to shader = 2*core in the newer GPUs. So the core has to be OCed to increase shader speed.

I'll check out HFM, thanks.
 
...What are the best notes to take? I'm just keeping up with time to complete a unit.

As you try various configurations (folding SMP only, GPU only, both, SMP higher priority, GPU lower priority, etc.) on a machine, make notes on PPD and wattage. I use a simple meter (http://www.p3international.com/products/special/P4400/P4400-CE.html) at the wall. It should become obvious with time and experimentation what cpu/gpu combinations work best for you. For example, my Phenom II/5870 combo will yield higher PPD when I fold fahcore16 on the gpu in conjunction with SMP on the Deneb, but folding on both at the same time bests folding alone on the Deneb by only a couple thousand PPD. The early return bonus on the Deneb makes it very tempting to fold it alone. The power at the wall spikes up about 200 watts folding both as well. So my judgement call is to only fold both in the winter where the wasted power can be put to use. But that is just my opinion; the decision as to what to fold is personal, hence keep good notes and fold what you want. Every point helps the team.
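
If it helps, notes like that can be boiled down to points per watt with a few lines of Python. This is only a sketch: the `Run` class is a made-up name, and the PPD and wattage figures are placeholders chosen to match "a couple thousand PPD" and "about 200 watts", not real measurements from any rig.

```python
from dataclasses import dataclass


@dataclass
class Run:
    label: str
    ppd: float    # points per day, e.g. as reported by a monitor such as HFM
    watts: float  # steady draw at the wall while folding

    def points_per_kwh(self) -> float:
        kwh_per_day = self.watts * 24 / 1000
        return self.ppd / kwh_per_day


def marginal_ppd_per_watt(base: Run, other: Run) -> float:
    """Extra PPD bought by each extra watt when moving from base to other."""
    return (other.ppd - base.ppd) / (other.watts - base.watts)


# Placeholder numbers for illustration only
smp_only = Run("SMP only", ppd=10000, watts=150)
smp_gpu = Run("SMP + GPU", ppd=12000, watts=350)

for run in (smp_only, smp_gpu):
    print(f"{run.label}: {run.points_per_kwh():.0f} points per kWh")
print(f"Adding the GPU: {marginal_ppd_per_watt(smp_only, smp_gpu):.1f} PPD per extra watt")
```

Swap in your own meter readings and PPD for each configuration you try; the marginal figure is the one that shows whether the extra watts are earning their keep.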
 