
HT or not to HT


Lonely Raven

I know this has been asked in some form or another, but looking
through our past pages I see vague questions with vague
answers.

I know with Hyper-Threading turned on, you can (and are
recommended to) run two instances of Folding.

Now, both the motherboards I have support switching HT
on and off in the BIOS. So I could in theory just shut it off
and run one instance of Folding.

Now, I've only had HT running for a few days, so I can't
answer this myself yet. But is there a speed/time advantage to
running two instances of Folding with HT on vs. one instance with HT off?
 
Good question. The only way to know for sure will be to try it both ways... I have HT turned ON on my laptop and run 2 instances. It seems to pump out about 650 PPW or so.
 
I've got one HT dual box, but I have nothing but the vague answers you got already. I run 4 clients to keep it busy.

HT allows multithreaded clients to execute threads in parallel, so if FAH client is multithreaded, it should benefit.

Is the FAH client multithreaded? Not sure.
 
grv said:
I've got one HT dual box, but I have nothing but the vague answers you got already. I run 4 clients to keep it busy.

HT allows multithreaded clients to execute threads in parallel, so if FAH client is multithreaded, it should benefit.

Is the FAH client multithreaded? Not sure.
It is NOT multithreaded in the sense that if you run ONE client on a multi-CPU machine, it will NOT finish a WU twice as fast. So in that sense, no, it is not. The obvious way to make it 'multithreaded' is to install multiple instances and give each one its own machine ID.
 
and in that case, the overhead of dividing up the work on a HT enabled system outweighs any benefit, IMO. On a multi proc unit, it's nice to not have to assign clients to specific cpus, like you do when HT is off, but aside from that, it would appear there is no benefit.
 
grv said:
and in that case, the overhead of dividing up the work on a HT enabled system outweighs any benefit, IMO. On a multi proc unit, it's nice to not have to assign clients to specific cpus, like you do when HT is off, but aside from that, it would appear there is no benefit.
Not so sure that HT is of no benefit. I have no absolute proof, but I can offer this: my 3.06 HT laptop with 2 instances finishes 2 p909s (38 pointers) in 14 hours, 9 minutes, at 8:29/frame. Now I don't think that if I had HT turned off and was only running 1 instance, one p909 would finish in 7 hours 5 minutes. It takes my 2525 MHz XP (recalling from memory) about 8.5 hours to complete a p909. I think HT does benefit.
 
nikhsub1 said:

Not so sure that HT is of no benefit. I have no absolute proof, but I can offer this: my 3.06 HT laptop with 2 instances finishes 2 p909s (38 pointers) in 14 hours, 9 minutes, at 8:29/frame. Now I don't think that if I had HT turned off and was only running 1 instance, one p909 would finish in 7 hours 5 minutes. It takes my 2525 MHz XP (recalling from memory) about 8.5 hours to complete a p909. I think HT does benefit.

are you serious????? my 3ghz p4 without hyperthreading does each p909 frame in 4:30..... that makes no sense dude..... why does my rig do it twice as fast as yours with same speed??
 
Wow, that is a big difference Xev.

You guys post some HTML or Screen Cap of your finished WU
times?

Like I said, I've only had mine for a few days, so I have no
data yet to check it myself. It would take me about two weeks of
completed WUs to get an idea: one week with HT on, one week
with HT off, just to get a solid feel for protein times.
 
[11:28:47] Project: 909 (Run 5, Clone 34, Gen 2)
[11:28:47]
[11:28:47] Assembly optimizations on if available.
[11:28:47] Entering M.D.
[11:28:54] Protein: p909_vill_str0
[11:28:54]
[11:28:54] Writing local files
[11:28:55] Extra SSE boost OK.
[11:28:57] Writing local files
[11:28:57] Completed 0 out of 250000 steps (0)
[11:33:31] Writing local files
[11:33:31] Completed 2500 out of 250000 steps (1)
[11:38:05] Writing local files
[11:38:05] Completed 5000 out of 250000 steps (2)
[11:42:40] Writing local files
[11:42:40] Completed 7500 out of 250000 steps (3)
[11:47:14] Writing local files
[11:47:14] Completed 10000 out of 250000 steps (4)
[11:51:48] Writing local files
[11:51:48] Completed 12500 out of 250000 steps (5)
[11:56:22] Writing local files
[11:56:22] Completed 15000 out of 250000 steps (6)
[12:00:56] Writing local files
[12:00:56] Completed 17500 out of 250000 steps (7)
[12:05:31] Writing local files
[12:05:31] Completed 20000 out of 250000 steps (8)
[12:10:05] Writing local files
[12:10:05] Completed 22500 out of 250000 steps (9)
[12:14:39] Writing local files
[12:14:39] Completed 25000 out of 250000 steps (10)
[12:19:13] Writing local files

see? hovering around 4:35 frame times on p909.... i have a 1.8a running at 167 fsb, 3006 mhz, no ht obviously. my memory is running at 417 mhz (might be relevant since mem bandwidth unleashes the beast within the p4).
 
Xevuhtess7 said:


are you serious????? my 3ghz p4 without hyperthreading does each p909 frame in 4:30..... that makes no sense dude..... why does my rig do it twice as fast as yours with same speed??
Because I am running TWO instances. You are running ONE. OK, so this may confirm that HT is of only modest benefit after all. If you get 4:30/frame with one instance, and I get 8:29/frame with 2 instances, it seems HT is giving about a 15 sec/frame advantage, or roughly 6%.

8:29 = 509 sec. / 2 = 254.5 sec = 4:14.5/frame vs. 4:30/frame (270 sec) for a 3 GHz non-HT CPU running one instance.
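
For anyone who wants to check the comparison, here is a minimal Python sketch of the same arithmetic. The function names are made up for illustration, and it assumes both instances progress at the same rate, so dividing the observed frame time by the number of instances gives an "effective" per-WU frame time.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'M:SS' frame time into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def effective_frame_seconds(frame_time: str, instances: int) -> float:
    """Observed frame time divided by the number of clients running at once."""
    return to_seconds(frame_time) / instances


def fmt(seconds: float) -> str:
    """Format seconds back into 'M:SS.s'."""
    minutes, rest = divmod(seconds, 60)
    return f"{int(minutes)}:{rest:04.1f}"


ht_on = effective_frame_seconds("8:29", 2)   # two instances, HT on
ht_off = effective_frame_seconds("4:30", 1)  # one instance, HT off
gain = ht_off - ht_on

print(fmt(ht_on), "vs", fmt(ht_off))                # 4:14.5 vs 4:30.0
print(f"{gain:.1f} s/frame, {gain / ht_off:.1%}")   # 15.5 s/frame, 5.7%
```

Keep in mind the two machines are not identical (3.06 GHz laptop vs. 3.0 GHz desktop with fast memory), so this is a rough comparison, not a controlled test.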
 
oooohhhhhhh sorry, im really confused about how the two instances thing works. i thought that they worked on two different wu's, not the same one... so i guess the frame times are on par then.

OH OH OH i just got an idea!! what if that new core thats comin out anytime now adds ht optimizations or something, that would be SICK
 
Here is instance 1:

[18:21:33] Project: 909 (Run 22, Clone 72, Gen 11)
[18:21:33]
[18:21:33] Assembly optimizations on if available.
[18:21:33] Entering M.D.
[18:21:54] (Starting from checkpoint)
[18:21:54] Protein: p909_vill_str0
[18:21:54]
[18:21:54] Writing local files
[18:21:55] Completed 80000 out of 250000 steps (32)
[18:21:55] Extra SSE boost OK.
[18:30:22] Writing local files
[18:30:22] Completed 82500 out of 250000 steps (33)
[18:38:35] Writing local files
[18:38:35] Completed 85000 out of 250000 steps (34)
[18:46:44] Writing local files
[18:46:44] Completed 87500 out of 250000 steps (35)
[18:54:57] Writing local files
[18:54:57] Completed 90000 out of 250000 steps (36)

And Instance 2:

[18:21:36] Project: 909 (Run 19, Clone 78, Gen 12)
[18:21:36]
[18:21:36] Assembly optimizations on if available.
[18:21:36] Entering M.D.
[18:21:56] (Starting from checkpoint)
[18:21:56] Protein: p909_vill_str0
[18:21:56]
[18:21:56] Writing local files
[18:21:58] Completed 127500 out of 250000 steps (51)
[18:21:58] Extra SSE boost OK.
[18:30:25] Writing local files
[18:30:25] Completed 130000 out of 250000 steps (52)
[18:38:38] Writing local files
[18:38:38] Completed 132500 out of 250000 steps (53)
[18:46:46] Writing local files
[18:46:46] Completed 135000 out of 250000 steps (54)
[18:55:00] Writing local files
[18:55:00] Completed 137500 out of 250000 steps (55)
[19:03:08] Writing local files
[19:03:08] Completed 140000 out of 250000 steps (56)
[19:11:20] Writing local files
[19:11:20] Completed 142500 out of 250000 steps (57)
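
If you want frame times without doing the subtraction by hand, here is a rough Python sketch that pulls them out of log lines like the ones above. It only assumes the "[HH:MM:SS] Completed N out of M steps" format seen in these pastes; the regex and function name are my own, not anything shipped with the client.

```python
import re
from datetime import datetime

# Matches lines such as: [18:30:25] Completed 130000 out of 250000 steps (52)
COMPLETED = re.compile(r"\[(\d{2}:\d{2}:\d{2})\] Completed \d+ out of \d+ steps")


def frame_times(log_text: str) -> list[float]:
    """Seconds elapsed between consecutive 'Completed' lines."""
    stamps = [datetime.strptime(m.group(1), "%H:%M:%S")
              for m in COMPLETED.finditer(log_text)]
    deltas = []
    for earlier, later in zip(stamps, stamps[1:]):
        seconds = (later - earlier).total_seconds()
        if seconds < 0:          # log rolled past midnight
            seconds += 24 * 60 * 60
        deltas.append(seconds)
    return deltas


SAMPLE = """\
[18:21:58] Completed 127500 out of 250000 steps (51)
[18:21:58] Extra SSE boost OK.
[18:30:25] Writing local files
[18:30:25] Completed 130000 out of 250000 steps (52)
[18:38:38] Writing local files
[18:38:38] Completed 132500 out of 250000 steps (53)
[18:46:46] Writing local files
[18:46:46] Completed 135000 out of 250000 steps (54)
"""

print(frame_times(SAMPLE))  # [507.0, 493.0, 488.0]
```

The first interval after "(Starting from checkpoint)" includes restart overhead, so it is probably fairer to drop it before averaging.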
 
wait im still really confused how the dual instances work..... both instances are working on the same wu? or what? or do they take turns doing frames, like cpu1 does frame 1 while cpu2 does frame 2, then cpu1 does frame 3 while cpu2 does frame 4?
 
Xevuhtess7 said:
wait im still really confused how the dual instances work..... both instances are working on the same wu? or what? or do they take turns doing frames, like cpu1 does frame 1 while cpu2 does frame 2, then cpu1 does frame 3 while cpu2 does frame 4?
No. 2 instances means that I am running 2 FAH clients at the same time, each working on its own WU... This is how you have to do it on multi-CPU machines. If you don't, then on a dually with only one instance of FAH, you will only get 1 CPU working. With 2 instances, both CPUs run at 100%. And since Windows thinks an HT machine is a dual-CPU machine, you must run 2 FAH applications at once to utilize 100% of the CPU.

FAH is NOT multi-threaded! One unit WILL NOT complete twice as fast on a multi-CPU machine!
 
ok.... that helps but still not clear on the technical stuff. each instance works on the same wu or two different ones? if they work on the same one.... then i dont understand how having both cpu's at 100% still yields the same speed as 1 cpu at 100%....
 
On mine, each instance has a different machine ID and does its own WU. dual proc with HT = 4 clients = 4 WUs.

They each get roughly 25% of the machine's total CPU capacity.
 
Windows sees an HT CPU as two "virtual" processors.

So if you run one instance of Folding on your machine, you will
only see 50% processor utilization.

Therefore, you have to run two SEPARATE instances, one for each
virtual processor, in order to utilize 100% of the processor.
 
Here is where I'm at on ONE of my setups.

This is a 2.4C with HT enabled, but I'm running it on an 845
board, so my FSB is only 220 x 12 = 2640 MHz.


[17:35:27] Project: 909 (Run 2, Clone 73, Gen 17)
[17:35:27]
[17:35:27] Assembly optimizations on if available.
[17:35:27] Entering M.D.
[17:35:47] (Starting from checkpoint)
[17:35:47] Protein: p909_vill_str0
[17:35:47]
[17:35:47] Writing local files
[17:35:50] Completed 95000 out of 250000 steps (38)
[17:35:50] Extra SSE boost OK.
[17:44:47] Writing local files
[17:44:47] Completed 97500 out of 250000 steps (39)
[17:53:47] Writing local files
[17:53:47] Completed 100000 out of 250000 steps (40)
[18:02:48] Writing local files
[18:02:48] Completed 102500 out of 250000 steps (41)
[18:11:45] Writing local files
[18:11:45] Completed 105000 out of 250000 steps (42)
[18:20:46] Writing local files
[18:20:46] Completed 107500 out of 250000 steps (43)
[18:28:54] Writing local files
[18:28:54] Completed 110000 out of 250000 steps (44)
[18:37:45] Writing local files
[18:37:45] Completed 112500 out of 250000 steps (45)
[18:46:36] Writing local files
[18:46:36] Completed 115000 out of 250000 steps (46)
[18:55:28] Writing local files
[18:55:28] Completed 117500 out of 250000 steps (47)
[19:04:19] Writing local files
[19:04:19] Completed 120000 out of 250000 steps (48)
[19:13:09] Writing local files
[19:13:09] Completed 122500 out of 250000 steps (49)
[19:22:00] Writing local files
[19:22:00] Completed 125000 out of 250000 steps (50)
[19:30:51] Writing local files
[19:30:51] Completed 127500 out of 250000 steps (51)
[19:39:42] Writing local files
[19:39:42] Completed 130000 out of 250000 steps (52)
[19:48:33] Writing local files
[19:48:33] Completed 132500 out of 250000 steps (53)
[19:57:23] Writing local files
[19:57:23] Completed 135000 out of 250000 steps (54)
 