
CUDA performance

Yah, I think I only have 32 SPs on my 9500GT. :/

What GFLOPS number does BOINC give you at stock clocks? 16 or so?

Edit: Sorry, just went back through and saw you mentioned it earlier ;)

I've got a little spreadsheet going, in an attempt to figure out a formula...

If possible, can all the CUDA guys post up the GFLOPS number reported in the Messages tab, using stock card clocks? Also include your model of card.
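
In case it helps the spreadsheet along, here's a rough sketch in CUDA C of a theoretical-peak calculation. Assumptions flagged up front: 8 SPs per multiprocessor (true for compute capability 1.x cards) and 3 FLOPs per SP per clock (MAD plus the dual-issued MUL on G8x/G200). BOINC's "est. GFLOPS" line uses its own, far more conservative formula, so don't expect these numbers to match it.
Code:
#include <stdio.h>
#include <cuda_runtime.h>

int main(void) {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    // Compute capability 1.x parts have 8 scalar processors (SPs) per multiprocessor.
    int sps = prop.multiProcessorCount * 8;
    // prop.clockRate is the shader clock in kHz; assume 3 FLOPs per SP per clock
    // (MAD + dual-issued MUL) for the theoretical peak.
    double gflops = (double)sps * prop.clockRate * 3.0 / 1e6;
    printf("%s: %d SPs @ %.0f MHz -> ~%.0f GFLOPS theoretical peak\n",
           prop.name, sps, prop.clockRate / 1000.0, gflops);
    return 0;
}
For example, a stock GTX 260 Core 216 (216 SPs @ 1242MHz shader) works out to 216 x 1.242 x 3 ≈ 805 GFLOPS theoretical peak, versus the "est. 100GFLOPS" BOINC reports for a GTX 260.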
 
GeForce GTX 260 (driver version 19062, compute capability 1.3, 896MB, est. 100GFLOPS)

Also, I'm not sure what's up with my shader clocks.. the card is at complete stock, haven't even installed Coolbits to OC it yet o_O yet it seems to be reporting as 100MHz below stock. Weird!

Hope this helps :)

EDIT: Got it sorted in the end.. just used RivaTuner to "overclock" it back to stock speeds of 650/1000/1400.. Out of interest, which is more important to OC with regard to SETI? I'm assuming the shaders, but when listing specs up the top, QIce has given core clocks?

Also, when giving stats like "GTX 260OC2 (G200) ~5500 216:72:28 @ 630", what does the 72 refer to? 216=shaders, 28=ROPs :eek:
 
SETI@home just updated their Tasks section under each computer, and it now has Run Time (sec) in addition to CPU time (sec). This will help with our calculations of RAC for video cards... no more guessing!
 

Yeah, I saw this when I woke up a few hours ago! I was somewhat confused when looking at CUDA results - now it's clear :)

Now if only they had a way of sorting results by time taken/name/whatever.. I'd love to see that.
 
As an update: I have a rig that's a dedicated CUDA rig. I run Rosetta on the CPU just for fun (it's just a single-core 3500+).

GTX 260 192SP, 684/1440/1053 core/shader/memory. RAC has bounced up and down around 6000 for the last 4 days. I'm going to switch up the cards now. Will update when that's ready.
 

Hmm. I know you've switched cards to the 95GT, but were you running the CUDA 2.3 DLLs on your GTX 260 192? 6k seems a little low. I'd be guessing around 8k to maybe 9k RAC with that card.. My GTX 260 216 @ 701/1509/1045 [c/s/m] on a single-core 3000+ is pushing 11500 RAC (still hasn't stabilised yet). Even with 24 fewer shaders and slightly lower clocks, I wouldn't think it would drop the RAC by 5k+.

Only thing I can think of is that you're either using the CUDA 2.2 DLLs, or it's the fact that my CPU is dedicated to feeding the GPU, whereas yours is running Rosetta.

Interesting indeed!
 
You don't use your CPU for crunching?

I wouldn't think there would be much difference between running SETI on the CPU and running Rosetta on the CPU - at least not as far as the GPU goes. It would add 500-600 RAC to the rig ...
 

Sadly, I don't run SETI on the CPU. If I do, it lowers RAC by ~2K. From what I've gathered from the SETI forums, and what little I know of hardware, here's a possible reason.

The GPU needs to be constantly fed information to crunch (you can't simply load the whole app and WU into the GPU and run it natively - CUDA's not that developed... yet), and this can be seen in the Windows Task Mangler as an intermittent 13-20% CPU spike every second or so. If you run CPUSETI as well as GPU, the two apps then fight for CPU cycles (as both default to low priority). In doing this, CPUSETI uses ~80% of the CPU and the other 20% is the CUDA app.. however, every time the CUDA app needs to feed the GPU, it has to flush the CPU pipeline of the CPUSETI threads, start the GPU thread, then when that terminates, reload the CPUSETI thread.. rinse/repeat.

The constant flushing/loading of the different CPU and GPU threads causes a massive decrease in RAC (as the GPU is -vastly- quicker than the CPU), particularly on single-core CPUs. For multi-core it's less of an issue because you have more cores to play with, so the flushing/loading of threads doesn't hurt as much, but it still has an impact.

As an example, rigs owned by Vyper and Sutaru Tsureku on the SETI forums are dedicated GPU crunchers, and they've dramatically increased their RACs by stopping CPU crunching. As a rule of thumb, for every GPU you have in your rig, keep 1 physical CPU core dedicated to feeding it, and you'll increase production quite a lot. (Note that I said physical core: if you choose to simply leave a logical core free [as in the P4/i7 family], you run into the same flush/load problem as stated above, just not as badly as you'd see on a single core.)
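
If it helps to picture why the card needs a babysitter, below is a minimal sketch of the kind of host-side feed loop a CUDA app runs. The kernel, names and sizes are made up - this is NOT the actual SETI code, just an illustration:
Code:
#include <cuda_runtime.h>

// Hypothetical kernel standing in for one chunk of signal processing.
__global__ void crunch(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * in[i];  // placeholder work
}

// Host-side feed loop: the CPU thread keeps copying data in, launching the
// kernel, and copying results back. Those calls are the intermittent
// 13-20% CPU spikes you see in Task Mangler.
void process_workunit(const float *host_in, float *host_out, int n, int chunks) {
    float *d_in, *d_out;
    cudaMalloc(&d_in,  n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    for (int c = 0; c < chunks; ++c) {
        cudaMemcpy(d_in, host_in, n * sizeof(float), cudaMemcpyHostToDevice);
        crunch<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
        // This blocking copy is where the CPU thread waits on the GPU; if a
        // CPU cruncher owns the core right now, the GPU sits idle instead.
        cudaMemcpy(host_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
}
Every memcpy and launch in that loop needs the CPU thread scheduled promptly; if a CPU cruncher owns the core at that moment, the GPU just sits there waiting to be fed - hence the one-physical-core-per-GPU rule of thumb.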

Hope this helps! Sorry for the long-winded explanation.
 
I have been watching Careface closely (as we are neck and neck) and I was amazed at his GTX 260's performance. Mine was only at 7,500 RAC max with CPU crunching enabled. I just now disabled my CPU crunching and am seeing if I can duplicate his efforts! Though my clocks are slightly lower.
 
Hmm. I know you've switched cards to the 95GT, but were you running the CUDA 2.3 DLLs on your GTX 260 192? 6k seems a little low. I'd be guessing around 8k to maybe 9k RAC with that card.. My GTX 260 216 @ 701/1509/1045 [c/s/m] on a single-core 3000+ is pushing 11500 RAC (still hasn't stabilised yet). Even with 24 fewer shaders and slightly lower clocks, I wouldn't think it would drop the RAC by 5k+.

Only thing I can think of is that you're either using the CUDA 2.2 DLLs, or it's the fact that my CPU is dedicated to feeding the GPU, whereas yours is running Rosetta.

Interesting indeed!

Well, I had thought the RAC had stabilized, seeing as it bounced up and down for the last 4 days. If I look at boincstats, its DAC for the last week was 6851. When I get home tonight, I'll put the GTX 260 192 back in with the CPU idle and see how that performs over the next week or so. Yes, I was using the 2.3 DLLs.

Now that you mention it though, since I added the Athlon II X4 to my rig 1 (too lazy to update my sig) instead of the Kuma that was in there, my RAC has jumped dramatically - more than I would have expected from going from 2 to 4 cores on the CPU. It might be that the extra CPU cores, plus running 1 less GPU, have resulted in greater efficiency from the remaining GPU. Very interesting indeed. I think I might be getting a RAC bump pretty soon.
 
Glad for the explanation - Thanks! :)
 
I have been watching Careface closely (as we are neck and neck) and I was amazed at his GTX 260's performance. Mine was only at 7,500 RAC max with CPU crunching enabled. I just now disabled my CPU crunching and am seeing if I can duplicate his efforts! Though my clocks are slightly lower.

If the machine you're talking about is the one in your sig (your systems are hidden on SETI) then you'll see more RAC than mine :) that dedicated core will help out. As it stands, my card isn't at its full potential - it's been shown that a factory OC'd GTX 260 216 @ 675/1458/1152 will pull 13-13.5k RAC if it's given a dedicated core and run 24/7.. mine's currently @ 701/1509/1045 (god knows how eVGA get their cards' mem clocks to 1152!!) and has topped out at 11.5k RAC.

While mem clocks do factor into RAC (it seems +100MHz on the RAM gives ~3% higher RAC, though it doesn't seem to scale linearly; the core and shaders aren't exactly bottlenecked by the RAM), my card could probably pull a max of 14k RAC if I chucked it into a multicore system (a Core i7 setup is on the way :), and maybe another GTX 260 216, so watch out!)

I'd estimate you're looking at around 12k RAC with your card 24/7 dedicated.
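
To put that ~3% per 100MHz in perspective, a quick back-of-envelope using my own RAC (ballpark only):
Code:
11,500 RAC x 1.03 ≈ 11,850 RAC for +100MHz on the memory
→ only ~350 RAC, which is why the core/shaders matter far more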

Well, I had thought the RAC had stabilized, seeing as it bounced up and down for the last 4 days. If I look at boincstats, its DAC for the last week was 6851. When I get home tonight, I'll put the GTX 260 192 back in with the CPU idle and see how that performs over the next week or so. Yes, I was using the 2.3 DLLs.

Now that you mention it though, since I added the Athlon II X4 to my rig 1 (too lazy to update my sig) instead of the Kuma that was in there, my RAC has jumped dramatically - more than I would have expected from going from 2 to 4 cores on the CPU. It might be that the extra CPU cores, plus running 1 less GPU, have resulted in greater efficiency from the remaining GPU. Very interesting indeed. I think I might be getting a RAC bump pretty soon.

Wicked, good to hear mate :) I haven't heard much about the 192SP 260s, but at a pure guess I'd say you'd be looking at around 9.5-10k RAC if you dedicated a core and ran it 24/7. Not sure how much 24 fewer SPs would impact it.. but let me know! Totally getting into this GPU crunching thing; it's fun! :burn:

Glad for the explanation - Thanks! :)

Anytime mate :) I'm trying to learn as much as I can about crunching and hardware - going to try to merge the two to get <crysis voice>Maximum Performance</crysis> out of my rig :beer:
 
This is a fairly dumb question, but how can you tell if your GPU is crunching? I put the 8400GS back in my T7200 computer about a week ago and BOINC sees it, but GPU-Z says the card's temperature doesn't change whether BOINC is running or not. The computer hadn't been crunching for a while, so the RAC was still climbing when I added the card.
 
Look under the Tasks tab in the BOINC Manager; GPU units will be labeled
Code:
setiathome_enhanced 6.08 (cuda)
If there are no tasks there, then you may need to re-optimize, and make sure you select CUDA in the optimization wizard. By default it is not checked.
 

That may have been it; I'll give it a day or two to get the GPU units. Thanks.

Update: Yeah, I must not have checked it, since I just got a bunch of CUDA apps.

Update #2: And it wasn't worth it. It looks like it's taking both cores almost twice as long to finish a WU (a WU took just under 2 hours before; one of the cores has been on a WU for that long and it's at 55%, and the other's been working on one for 45 minutes and is only 22% done), and the blistering-fast 8400 will probably take about 8 hours. Is SETI heavily GPU-speed-bound or bandwidth-bound? The card only has a 64-bit memory bus, but it's still running fairly cool (~55C load, 45C idle), so I might give it a boost in clock speed if that would help.
 