
My New Vega 56 Cooling Mod

OP
crull

Member
Joined
Dec 19, 2002
The lower noise and temps alone are worth it to me. I have tried clocking a little higher, and I think it throttles less because of the better thermals. The best score I've seen has been around 18,600, but I don't think it was stable. The Fire Strike Extreme score I'm getting is around 10,500, which I think is pretty good for my Ryzen CPU. I'm not comfortable with the hot spot temperature when I'm running out-of-spec wattage, so I'm trying to figure out how to lower it. I did figure out that it's not related to the VRMs. I did a test with the fan on the H55: I ran a load test at a constant 800 rpm and wrote down all the temps, then repeated the load test with the H55 fan at its full 2,000 rpm. The VRM temps didn't change at all between the two tests. The GPU, HBM and hot spot temps all went up or down equally. One thing I noticed is that if you add the HBM and GPU temps together, the value is close to the hot spot temp. My next step is to add a heat sink and thermal pad to the back of the card, in the middle of the mounting bracket, to see if that helps at all.
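
A two-run comparison like the fan test above can be tabulated with a few lines of Python. This is only a minimal sketch: the function name `compare_runs`, the 1 degree tolerance, and every reading in the example are made-up placeholders for illustration, not measurements from this card.

```python
def compare_runs(low_fan, high_fan, tolerance_c=1.0):
    """Return per-sensor deltas (high_fan - low_fan) in degrees C and
    a flag saying whether the sensor moved by more than tolerance_c."""
    report = {}
    for sensor in low_fan:
        delta = high_fan[sensor] - low_fan[sensor]
        report[sensor] = (delta, abs(delta) > tolerance_c)
    return report


# Placeholder readings in degrees C (hypothetical, for illustration only).
run_800rpm = {"GPU": 45.0, "HBM": 50.0, "Hot Spot": 96.0, "VRM": 60.0}
run_2000rpm = {"GPU": 41.0, "HBM": 46.0, "Hot Spot": 92.0, "VRM": 60.0}

for sensor, (delta, changed) in compare_runs(run_800rpm, run_2000rpm).items():
    status = "responds to fan speed" if changed else "unaffected"
    print(f"{sensor}: {delta:+.1f} C ({status})")
```

A sensor that stays flagged as unaffected across both runs, as the VRM temps did in the test described above, is probably not being cooled by the part whose fan speed changed.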

- - - Auto-Merged Double Post - - -

Thanks!!!
 
OP
crull

Member
Joined
Dec 19, 2002
I think I finally figured out the hot spot temperature. I've been doing a lot of reading about it. I've been testing and fine-tuning an overclock with extra wattage while trying to find a point where the hot spot temperature under load doesn't go any higher than 100C. I found a good stable overclock which gives me around 5,400 in Fire Strike Ultra and around 6,200 in the 4K Superposition benchmark, with the hot spot going no higher than around 100C under load.
Last night I decided to pull the card out and try a couple of things with the paste. I cleaned it all up really well and carefully cleaned out the crevice between the dies. Once it was clean, I thought I could see a dot between the dies which might be a thermal sensor. I took some IC Diamond and very carefully scraped it, a little bit at a time, into the whole crevice with a piece of credit card. I then took some more IC Diamond and put a good spread on the H55 copper plate, a little on the thick side, and clamped the H55 down tightly and evenly.

Running the Time Spy stress test, all my temps are about the same as before, but the hot spot temperature is now around 20C lower than it was. It's now around 78C under load.
 
OP
crull

Member
Joined
Dec 19, 2002
Screenshot.jpg

Taken after a Fire Strike benchmark run, showing maximum temps at an ambient temp of 25C. Now that the hot spot temperature is a lot lower, I can work on increasing the wattage with higher clocks.
 

Woomack

Benching Team Leader
Joined
Jan 2, 2005
Nice temps after the mod :thup:

I wasn't interested in Vega because of the high prices, but since they dropped by nearly 60% over the last few months I got one, and I'm a bit surprised by how it runs. My first idea was to change the cooling, which I will probably do at some point, but for now the stock cooler on my card is more than enough to play around with. It actually runs quiet and the card doesn't get too hot. On the other hand, I see that once it reaches a certain temp it starts to throttle a bit. I don't know how it behaves on an AIO, but on air, lowering the voltage helps it hold a higher boost frequency without drops. When I lowered the voltage from 1.2V to 1.1V, my card could run a 50MHz higher clock, which gave 1702MHz in benchmarks.

Btw. on my test rig with [email protected], Fire Strike score is ~18000, after OC ~21000.

I'm not sure where you saw 100°C. The card should shut down at that temp. Max core temp is 90°C+, while the absolute max for some other spots is 100°C ... and the card will start throttling at 70°C+. I haven't seen much above 70°C on my card, but I have the one with the Nano PCB, so everything is packed closer to the GPU. That is an additional challenge for cooling mods.
 

Zerileous

Senior Member
Joined
Jun 21, 2002
Yeah, what Woomack said. Strangely enough, my card seems to use more power and not even clock higher at 1.2V. Also, people seem to get great results clocking their HBM over 1000MHz, and continue to see improvements up to 1100MHz. Worth noting: the "memory voltage" in Global WattMan doesn't seem to actually change the voltage to the HBM. I wouldn't worry much about changing it; just make sure it's less than your voltage at State 7, as it serves as some kind of lower limit. At least that's what I've read, but it's not entirely clear to me how the chip behaves in State 1 if the memory voltage is set higher than the State 1 voltage.
 

Woomack

Benching Team Leader
Joined
Jan 2, 2005
I was playing with my card some more and was able to set higher clocks, but a lot depends on cooling.
Vega 56 runs up to 1700MHz on the GPU and 950MHz on the memory. I simply can't get past that without additional mods. Once I flashed my card with a Vega 64 BIOS, I could set the memory clock up to 1050MHz. On the other hand, I have to set the GPU voltage to 1.1V just so the card keeps a stable clock without thermal throttling. That's because throttling starts above 60°C and becomes visible at about 70°C+.
There is one more thing. Most Vega 64 cards have Samsung HBM, which clocks ~50-100MHz higher than Hynix/Micron. My card has Hynix HBM, which is stable at 1000-1025MHz, but at 1050MHz I can already see single artifacts from time to time. It's still 100% stable in benchmarks.

I can't complain about my card's cooling, but I have limited options if I want to upgrade it because my card is based on the Nano PCB. It looks like this:
HTB1FrhQKgaTBuNjSszfq6xgfpXav.jpg

The Sapphire Pulse cooler is good enough to keep my card at ~65°C in benchmarks after OC when I set a higher fan speed.
 

Zerileous

Senior Member
Joined
Jun 21, 2002
Hrm, I didn't expect thermal throttling to start that low. I guess the BIOS expects air cooling, so it starts backing off earlier to avoid overheating. I know I see 60C GPU temps, never 70C though. Maybe that's what's holding me back. I wonder if the Vega 64 LC BIOS has a less aggressive thermal throttle. Definitely a different-looking Vega you've got there.
 

Kenrou

Member
Joined
Aug 14, 2014
I remember railing that my Strix 980 Ti started throttling at ~62C, and it was impossible to stop it going over that on air if you like games with all the bells and whistles. Water is the cure, I never saw a temp above 55C afterwards 😉
 

Zerileous

Senior Member
Joined
Jun 21, 2002
I think I need to take better notes. After almost 20 minutes of Heaven (ultra, 1440p), GPU clocks varied between 1690MHz and 1705MHz with the power limit at 300W. Actual usage read between 275W and 300W. Max temp was 51C. The set clock was around 1720MHz. I never saw anything close to that, but I had to set it that high to get the clocks I did. Not sure why it works this way; maybe it's just how GPUs work. I don't fully understand how they set everything.
 

Woomack

Benching Team Leader
Joined
Jan 2, 2005
The clock fluctuates depending on temp, voltage, power limit and GPU load. The whole point is to keep a stable frequency. For example, you may see higher performance at a steady ~1600MHz than at ~1700MHz+ with random frequency drops (it then drops to ~1500MHz quite often).
When Vega is on water and stays under 60°C all the time, it shouldn't need anything more and will boost higher than at 65°C+. On air, lowering the voltage helps a lot. I was actually able to run it in some tests at 1800MHz+ with the GPU voltage at ~1.1V, but I had to set the fans to max, so ~3100rpm, to keep temps down.

In Folding@home I was able to run the card stable for 24h+ at 1630MHz with boost up to 1700MHz and 1.05V. GPU-Z was showing 130W during work, so pretty nice efficiency considering that I've seen 0.7-1.4M ppd (depending on the project). On a 1080 Ti with the power limit at 75%, so about 160W, it was 0.8-1.2M ppd.
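
For anyone who wants to put a number on that efficiency comparison, here is a minimal Python sketch using only the figures quoted above. The helper name `ppd_per_watt` is made up for illustration, and the calculation assumes the two wattage readings are directly comparable, which may not hold if the tools report power differently for each card.

```python
def ppd_per_watt(ppd_millions, watts):
    """Points per day per watt, given PPD in millions and reported power in watts."""
    return ppd_millions * 1_000_000 / watts


# Figures quoted in the post: (low PPD, high PPD) in millions, and watts.
cards = {
    "Vega at 1.05V": ((0.7, 1.4), 130),
    "1080 Ti at 75% power limit": ((0.8, 1.2), 160),
}

for name, ((low, high), watts) in cards.items():
    print(f"{name}: {ppd_per_watt(low, watts):,.0f} to "
          f"{ppd_per_watt(high, watts):,.0f} ppd per watt")
```

Under that assumption the Vega lands at roughly 5,400 to 10,800 ppd per watt and the 1080 Ti at 5,000 to 7,500, so the ranges overlap and the result depends heavily on the project.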
 

Zerileous

Senior Member
Joined
Jun 21, 2002
Thanks Woomack. I haven't seen temps in the 60s at all, I must have been confused before. It sounds like you have a very sweet piece of silicon there.

What are you using to set your overclock? In Global WattMan Ryzen master there are 7 power states with individual clocks and voltage to adjust. AB just gives you one clock to adjust.

Are you doing anything to feed the GPU extra power, or are you working with the normal 300W Vega 64 limit from your bios flash?

Thanks for the tip on stabilizing the clock speed. I was noticing that on some Gamers Nexus videos as well, good to have a confirmation. I will start re-working my OCs to achieve stable clocks and see where that gets me.
 

Woomack

Benching Team Leader
Joined
Jan 2, 2005
My card's GPU seems a bit above average. The memory is not so good, as it's Hynix HBM, and artifacts and instability start to show at about 1050MHz. Compared to typical Samsung results, that's about 50-100MHz worse.
My card has a 180W TDP design, but in reality it has the same power section as the reference card, just packed onto a small PCB. I used the Sapphire Nitro+ BIOS, which bumps clocks from 1590/800MHz to 1630/945MHz. This is also the max stable frequency for my card. For benching I set +50% power, +5-7% clock and the last 2-3 voltages to 1.1V. Sometimes only the last one is active and I'm not sure why. Anyway, only the last one matters under higher GPU load. When I set the voltage with AB, results were similar, so I guess the other voltages don't really matter.
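
To make those percentages concrete, here is a small Python sketch of the arithmetic. The helper names are made up for illustration, and it assumes the +5-7% clock offset applies to the 1630MHz BIOS clock, which the post does not state explicitly.

```python
def apply_offset(base_mhz, percent):
    """Clock in MHz after applying a percentage offset to a base clock."""
    return base_mhz * (1 + percent / 100)


def percent_change(old_mhz, new_mhz):
    """Percentage change going from old_mhz to new_mhz."""
    return (new_mhz - old_mhz) / old_mhz * 100


# BIOS bump quoted in the post: 1590/800MHz -> 1630/945MHz.
print(f"GPU bump: {percent_change(1590, 1630):+.1f}%")
print(f"HBM bump: {percent_change(800, 945):+.1f}%")

# +5% and +7% applied to the 1630MHz BIOS clock (assumed base).
for pct in (5, 7):
    print(f"+{pct}% clock: {apply_offset(1630, pct):.1f}MHz")
```

On those assumptions the BIOS swap is a small GPU bump (about 2.5%) and a large memory bump (about 18%), and the +5-7% offset targets roughly 1710 to 1745MHz before any throttling.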

Higher GPU voltage can help when you have your card on water and temps are as good as yours, but I'm not sure you will see a difference. Some guys around the web who were trying various mods kept their cards on water and still limited the GPU voltage to ~1.1V.
Even though most of these cards give similar results, try and see what yours can do.

I was thinking of changing the cooler on my card, but I have already matched or passed most others who were benching on water, and I don't need water cooling for anything else, so I will stick with the stock Sapphire cooler. As long as I don't push it higher, it's quiet and doesn't heat up much.


One 3DMark result on my card.

3dm.jpeg


Btw. there are not many threads about Vega in general so I guess we can keep mods and OC results in this thread so it will be easier to find.
 

Zerileous

Senior Member
Joined
Jun 21, 2002
Thanks for the tips. How much clock fluctuation is acceptable? Obviously 100MHz is too much, but what about 50MHz or 25MHz? Should it be a rock-solid line? I'm watching the graph in WattMan and trying to get things stable.

My Samsung HBM does 1100MHz easily, with artifacts above that. I was going to purchase an ASRock Vega 56 but went with the Sapphire Vega 64 for about $70 more at the time, because I had read that most of the ASRock cards were shipping with Hynix HBM and the Sapphire cards had Samsung. At the end of the day I'm not sure the 50MHz is worth it, but I kind of had an obsession with memory when I was building this system.
 

DaPoets

Member
Joined
Aug 23, 2007
Loving the info in this thread. Just put my ASUS Strix VEGA 64 on water so I have some OC playing around w/ to do...
 

Zerileous

Senior Member
Joined
Jun 21, 2002
Maybe we do need to spin off a Vega OC thread....

I think I need to find a different program to stress the GPU when adjusting clocks. I've been using Heaven 4.0 at 1440p and I have to go insanely low to minimize fluctuations. Even with 1442MHz and 1.2V set, I see actual clocks fluctuating between 1344MHz and 1379MHz. I must be misunderstanding something here :confused:, unless Heaven is just not a consistent enough load. I could use folding, but I don't want to risk returning inaccurate data.
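
One way to track this while tuning is to reduce a clock log to two numbers: how wide the fluctuation is, and how far the best observed clock sits below the set clock. The Python sketch below is a rough illustration; `clock_stats` is a made-up helper, and it is fed only the two extremes quoted above, where a real monitoring log would contain many samples.

```python
def clock_stats(set_mhz, samples_mhz):
    """Summarize logged clocks (MHz) against the set clock (MHz)."""
    lowest, highest = min(samples_mhz), max(samples_mhz)
    return {
        "spread_mhz": highest - lowest,       # width of the fluctuation band
        "shortfall_mhz": set_mhz - highest,   # set clock minus best observed clock
        "shortfall_pct": (set_mhz - highest) / set_mhz * 100,
    }


# Only the extremes quoted in the post are used here.
stats = clock_stats(1442, [1344, 1379])
print(f"spread: {stats['spread_mhz']}MHz, "
      f"shortfall: {stats['shortfall_mhz']}MHz ({stats['shortfall_pct']:.1f}%)")
```

For the numbers in this post that gives a 35MHz spread, with the best observed clock 63MHz (about 4.4%) under the set clock.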
 

DaPoets

Member
Joined
Aug 23, 2007
Perhaps a spin off vega thread would be helpful. I'll be messing w/ my "OC" tonight so if a thread is up I'll add what I've done. I'll have questions I'm sure lol
 
OP
crull

Member
Joined
Dec 19, 2002
Since the last time I posted I have been testing all sorts of drivers, BIOSes and registry soft PowerPlay tables. My Vega 56 has Hynix memory, which going by other posts isn't flashable, but I have flashed it many times with different Vega 64 BIOSes. I can confirm the voltage increase in HWiNFO to 1.356 instead of 1.250. I couldn't get the memory higher than 925 without that extra voltage. It seemed really stable running the Sky Diver and Fire Strike stress tests, but whenever I ran the Time Spy stress test it would crash. I tried different BIOSes and drivers and just couldn't get it stable in the Time Spy stress test, so I went back to the 56 BIOS.
The most reasonable and stable undervolt I have come up with is 1642 at P6 (1120mV) and 1662 at P7 (1125mV). I also add a soft PowerPlay table to the registry with a wattage increase to 220W and a power limit set to +142%. I loop Fire Strike Ultra, and with those settings the clock settles to around 1610 after a few loops and the average power draw is roughly 275 watts. The 100C temperature I was referring to in my previous post is the hot spot temperature, which you should keep an eye on. The other temperatures can all be pretty low while that one is still high. It is set to a 105C maximum in the BIOSes; if it reaches that, the card will throttle. I haven't noticed any throttling below that 105C maximum. Since my last re-paste, and running the undervolt above, the hot spot temperature now averages 90C under load, depending on ambient temperature.

I also think I got a very good deal on this card, which I bought used for $240. I did spend some time and money getting it to where it is now, but I still think I'm ahead.
 
OP
crull

Member
Joined
Dec 19, 2002
If you can get a paid copy of 3DMark, you can loop the tests in a window using the custom option. I find that Fire Strike Ultra stresses the card the most as far as heat and wattage go. That is the worst-case test, because most of the other tests don't seem to stress the card as much. I use all the other tests for stability, but Fire Strike Ultra for temps and wattage.
 

Woomack

Benching Team Leader
Joined
Jan 2, 2005
If you mean temps and wattage then maybe, but the card crashes faster in all the "easy" tests where FPS is much higher. On my card I had to lower the core clock by about 20-30MHz to pass Ice Storm or Night Raid.