
My New Vega 56 Cooling Mod


Zerileous

Senior Member
Joined
Jun 21, 2002
I think I'll just bite the bullet for that then. Superposition works well too, but it would be nice to have the comparison with different reviews :D

I was literally going to ask you where you found your hot spot temp in HWiNFO64, because I have been wondering whether it's been impacting my performance at all, and I just now looked at my GPU in that program. It's the third one. It's been there the whole time. :bang head: I wonder if I can get that to read out in my OSD (I've still been using Afterburner for the OSD, so maybe this is a good opportunity to set up the HWiNFO64 OSD). After a cursory search engine review, it looks like the hot spot might be difficult to isolate. Mine is 11C warmer than the GPU temp running [email protected] with 1050mV. Switching to a 1200mV OC profile and running Heaven for a few minutes, HBM and "core" stay at 45C, but the hot spot reads 20C above that at 65C, possibly well into throttling territory.
 
OP
crull

Member
Joined
Dec 19, 2002
I'm only using WattMan for my Vega. I used to use Afterburner with RTSS for on-screen monitoring with my R9 290X, but felt it wasn't needed anymore with WattMan. I am still using RivaTuner Statistics Server though, but with HWiNFO64 instead of Afterburner. Once RTSS is set up, turn on the things in HWiNFO64 you want to watch while testing or gaming.

Here is the latest information.

I filled in the complete area around the dies with Arctic Alumina thermal adhesive for three reasons: so it's easier to clean for re-pasting, to see if it would pull any heat off the substrate around the dies, and to seal the edges of the channel between the dies so the paste in the channel wouldn't be forced out by pressure. The channel between the dies was left open for paste.

So far I have tried the following thermal pastes:

IC Diamond
Kingpin Cooling KPx
MX-4
Thermal Grizzly Kryonaut

Kryonaut totally messed up my hot spot temperature and MX-4 wasn't any better. IC Diamond was pretty good all around, especially for the hot spot temperature. Right now I have the best temps yet: I'm using IC Diamond in between the dies and Kingpin KPx on the surface of the dies. I also changed the mounting bolts for the AIO so there aren't any springs; now I just tighten the nuts until they stop. That's the reason I like the Noctua coolers: no guesswork about the pressure.

When I'm running the 3DMark Firestrike Ultra stress test, my wattage averages 280W-300W with P7 at 1672MHz/1125mV, P6 at 1642MHz/1120mV and SPPT 220W (+142% Power Limit). GPU clock settles around 1610MHz. With a 26C ambient, my GPU temp is 62C with the radiator fan at 1650rpm. The hot spot temperature under that load averages 92C.
 

Zerileous

Senior Member
Joined
Jun 21, 2002
Wow that is a big hotspot delta. We've gotta figure out where it is! My core/HBM are epoxied, here's a pic. For TIM I used the EK-TIM Ectotherm that came with my block.
IMG_20190110_151026.jpg
 
OP
crull

Member
Joined
Dec 19, 2002
I believe it's because of the wattage in addition to the GPU temperature. Someone in another post thought it could be a calculation of some kind using GPU temp and wattage, not an actual sensor reading. One thing I notice: the GPU and HBM temperatures rise slowly because of the cooling, but the hot spot temperature doesn't do the same thing. It doesn't rise as steadily as the other temps.


Can you please run the Firestrike Ultra stress test and get close to the same wattage as in my post and let me know how your hot spot temperature compares to your GPU temp? When the wattage is lower than 250W my hot spot temp is only around 20-25C higher than the GPU temp, which seems reasonable.

I thought the sensor is between the dies, which would make the most sense because thinner pastes don't seem to work very well between the dies. IC Diamond is very thick and so far has given me the best performance with the hot spot temperature.
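
One way to make this comparison repeatable is to log the sensors during a run and average the hot spot delta separately for samples below and above a power threshold. A minimal sketch, assuming the log has already been parsed into dictionaries. The key names (`power_w`, `gpu_c`, `hot_spot_c`) and the sample values are hypothetical placeholders, not real HWiNFO64 column names or measurements from either card:

```python
from statistics import mean


def mean_delta_by_power(samples, threshold_w=250.0):
    """Average (hot spot - GPU temp) for samples below and at/above a power threshold.

    Each sample is a dict with hypothetical keys: power_w, gpu_c, hot_spot_c.
    Returns None for a bucket that has no samples.
    """
    below = [s["hot_spot_c"] - s["gpu_c"] for s in samples if s["power_w"] < threshold_w]
    above = [s["hot_spot_c"] - s["gpu_c"] for s in samples if s["power_w"] >= threshold_w]
    return {
        "below": mean(below) if below else None,
        "above": mean(above) if above else None,
    }


# Placeholder readings for illustration only (not measured data).
samples = [
    {"power_w": 230.0, "gpu_c": 58.0, "hot_spot_c": 80.0},
    {"power_w": 245.0, "gpu_c": 60.0, "hot_spot_c": 84.0},
    {"power_w": 285.0, "gpu_c": 62.0, "hot_spot_c": 91.0},
    {"power_w": 300.0, "gpu_c": 62.0, "hot_spot_c": 93.0},
]

result = mean_delta_by_power(samples)
print(result)
```

The 250W default simply mirrors the threshold mentioned above; two cards can then be compared bucket by bucket instead of by a single peak reading.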
 

Zerileous

Senior Member
Joined
Jun 21, 2002
I just realized you said Ultra, but here is what I got for Extreme. Max temps: CPU 73.3 C, GPU 46 C, HBM 48 C, Hot Spot 71 C, Max GPU Chip Power 334W. My estimated loop temperature (it measures the air exhausted from the radiator and applies an offset to estimate loop temperature) did reach 39 C, the highest I've seen. So definitely more thermal load than the Unigine tests.

For Firestrike Ultra I used what I thought would be a more aggressive fan curve, but it turned out to be less. So that was a bit of an oops, but it turned out OK.
Max Temps: CPU 74.5 C, GPU 48 C, HBM 50 C, Hot Spot 77 C, Max GPU Chip Power 357W. Estimated loop temp 40 C max.

In summary, I too have a roughly 30 C delta for my Hot Spot temp (29 C on the Ultra run). I really would like to figure out what this temp means, how it impacts the GPU's performance, and specifically whether it causes throttling.

Settings for both runs, using the Liquid Cooling BIOS: frequency is at 0%, which gives P6 1667 MHz & P7 1752 MHz. I gave P6 1150mV and P7 1200mV. Power slider is at +50%, which on this BIOS represents 350W. HBM at 1150 MHz. Max GPU clock was 1722 MHz, and clocks remained fairly stable above 1700 MHz for the tests.
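
The deltas being compared in this thread are plain subtraction of the GPU temperature from the hot spot reading. A small sketch of that arithmetic, using only the figures posted so far (the dictionary layout and names are illustrative; crull's numbers are the averages from the earlier stress test post, the others are maximums):

```python
# Temperatures in degrees C, as reported in the posts above.
runs = {
    "Firestrike Extreme (Zerileous)": {"gpu": 46, "hbm": 48, "hot_spot": 71},
    "Firestrike Ultra (Zerileous)": {"gpu": 48, "hbm": 50, "hot_spot": 77},
    "Firestrike Ultra stress test (crull)": {"gpu": 62, "hot_spot": 92},
}


def hot_spot_delta(run):
    """Return how far the hot spot reading sits above the GPU temperature."""
    return run["hot_spot"] - run["gpu"]


deltas = {name: hot_spot_delta(run) for name, run in runs.items()}
for name, delta in deltas.items():
    print(f"{name}: hot spot is {delta} C above GPU temp")
```

Both cards land within a degree of each other on the Ultra load despite very different absolute temperatures.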
 
OP
crull

Member
Joined
Dec 19, 2002
Thanks, your tests make me feel better to know we have around the same delta for the hot spot temperature.

I think I have an idea for how to figure out exactly where the hot spot temperature is.

I might get a can of this.

https://www.amazon.com/MG-Chemicals-403A-Super-Spray/dp/B008UH3NB8


While I am testing under heavy load I will spray certain parts of the card and watch for a reduction in hot spot temperature. This should tell us exactly where that reading is coming from.
 

Zerileous

Senior Member
Joined
Jun 21, 2002
Sounds nuts to me lol. It does say "safe for energized circuits" so maybe it will be OK, but I wouldn't take their word for it. You'll also need to be careful of condensation, if any part of the PCB, cooler, motherboard, case, etc is brought below ambient temperatures by the stuff. Maybe best to grease everything up first as a precaution, but then you have a big mess after.
 
OP
crull

Member
Joined
Dec 19, 2002
That stuff is how they look for intermittent thermal shorts on circuit boards while running. Sometimes there will be a solder break from heat and when cooled it contracts to make a connection. I didn't think about the condensation, but it would only be in short bursts just to see a reduction in the temperature. It comes with a straw just for that very purpose.

Up to this point I've already taken a lot more chances with the card than I should have, so what's one more...lol.
 

Zerileous

Senior Member
Joined
Jun 21, 2002
I watched a nauseating amount of Buildzoid videos today trying to find a mention of this hotspot. I didn't notice one, but he does tend to drone on, so I might have missed something. From watching the registry video, I confirmed my theory that the card in reality uses a current limit, not a wattage limit. So decreasing the voltage can help prevent the card from hitting that limit at the highest clocks (provided you don't increase that limit instead, which makes more sense up to a point if you can keep everything under 60 C). He talks about some absurd limits, but he's also killed some of these cards. He also mentions 60 C as a point of instability, but that's for the GPU. I'm wondering if the "hot spot" is somewhere in the VRM. Why / how would some point in the package be 30C hotter than either the GPU or HBM?
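
If the limit really is on current, the effect of undervolting can be roughed out with I = P / V. The sketch below is only a first-order illustration under loud assumptions: it treats the whole reported "GPU Chip Power" as drawn at the core voltage (the reading may cover more than the core rail), it scales power with voltage squared at a fixed clock (the usual dynamic-power approximation, ignoring leakage), and it borrows the 357W / 1200mV figures from the Firestrike Ultra run and the 1050mV undervolt mentioned earlier in the thread. The function names are made up for this example:

```python
def estimated_current_amps(power_watts, voltage_volts):
    """Estimate current from power and voltage (I = P / V)."""
    return power_watts / voltage_volts


def scaled_power_watts(power_watts, old_volts, new_volts):
    """First-order estimate: dynamic power scales with V^2 at a fixed clock."""
    return power_watts * (new_volts / old_volts) ** 2


baseline_power = 357.0  # W, max GPU chip power reported for the Ultra run
baseline_volts = 1.200  # V, P7 voltage used for that run
undervolt_volts = 1.050  # V, assumed to hold the same clock (may not be stable)

baseline_current = estimated_current_amps(baseline_power, baseline_volts)
undervolt_power = scaled_power_watts(baseline_power, baseline_volts, undervolt_volts)
undervolt_current = estimated_current_amps(undervolt_power, undervolt_volts)

print(f"1200mV: ~{baseline_current:.0f} A at {baseline_power:.0f} W")
print(f"1050mV: ~{undervolt_current:.0f} A at {undervolt_power:.0f} W")
```

Under these assumptions current falls in proportion to voltage, which is why a lower voltage leaves more headroom before a current limit is reached.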
 
OP
crull

Member
Joined
Dec 19, 2002
My VRM is cooled pretty well and it hasn't made any difference in that temperature.

I also plugged the pump in wrong once and didn't know it. I started running a stress test and all the temperatures shot up, but not the VRM temp. The hot spot temp hit 105C and the card throttled. The pump doesn't affect the cooling on my VRM.

"Why / how would some point in the package be 30C hotter than than either the GPU or HBM?"

That's why if there is an actual temperature sensor I think it's located in the space between the dies which has no direct contact with the heatsink.
 

Zerileous

Senior Member
Joined
Jun 21, 2002
Yeah, I did another round of googling and the VRM theory got thrown out pretty early. The developer of GPU-Z stated on the TPU forums that it's somewhere in the GPU silicon. We also know that it throttles at 105 C. Is there a benefit to temperatures lower than 105 C? As Woomack indicated, there seem to be benefits in stability and performance for GPU temps < 60 C, even though the card officially throttles at 85 C. Does a similar situation apply to the hot spot?

I do recall placing thermal pads over the back of the entire package for my back plate. One Reddit user states that increasing mounting pressure helps, but I would be wary of overdoing this, especially with bare dies. I read or heard somewhere that HBM is very fragile, although it sits lower than the GPU. Another Reddit user had good luck by providing active airflow over the backplate. Others say thin thermal paste is best. Mod wise, I'm getting tempted to try lapping the resin over my dies. Not sure if EK products generally benefit from lapping; mine is nickel coated though.

I did a little experiment and ran Firestrike Ultra with the side panel off and pointed my cheap IR thermometer at the back plate. The laser bounced around the case a lot; the smallest dot closest to the gun is the real one. This was the hottest point on my back plate. Note that the edge of the die (well, at least the water channels that are over the die in the water block) corresponds with the rear half of the front fitting. So technically the warmest spot on the back plate is not directly over the die, but actually between the die and the VRM. I can't actually see under the back plate because there is a whole row of thermal pad behind the top VRM row.
IMG_20190331_142316.jpg
 

Woomack

Benching Team Leader
Joined
Jan 2, 2005
I've seen some comments around the web saying the hottest spot was near the HBM. Actually, the memory heats up more than the GPU in some situations and that causes faster throttling. I guess it depends on the cooling used. On my cooler, most things have similar temps in GPU-Z.
I had to switch rigs for the weekend and I noticed weird behavior on my card (I wasn't playing games on my Vega before). When the load goes up and down, the card has trouble holding optimal settings and causes games to freeze for a short period of time. When I thought I had set it right, the same thing happened when switching between a game and the web browser. It may also be related to the Vega 64 BIOS :) Btw. after flashing the Vega 56 with the Vega 64 BIOS I hear coil whine (sometimes much louder than the cooler itself), while on the stock BIOS the card was nearly silent all the time. I'm using this card for tests only, so it's not a problem for me (at least for now).

I would think about water cooling but there is no Vega Nano block available. There were 3 or 4 and all are discontinued except Bykski which is too expensive with shipping, tax and other costs.
 
OP
crull

Member
Joined
Dec 19, 2002
I found with this demo game program I was able to detect overclocked memory issues easier than anything else. If it seems stable give it a try and move the mouse around really fast when in the demo. I would get different colored spots that would flash on and off.

TR Dox Demo

I don't think the hot spot lowers performance unless it gets to 105C, which is set in the BIOS; then the card throttles. I have never seen any reduction in performance when it's been lower than that.

My main concern with the hot spot temp is the lifespan of the card. Maybe there isn't any way for it not to be so hot when we're pushing the power limit higher.
 

Zerileous

Senior Member
Joined
Jun 21, 2002
Woomack, in HWiNFO64 there is a "Hot Spot Temperature" that is not currently listed in GPU-Z (I say not currently because it seems to have been present in previous versions, at least to the extent that the developer of GPU-Z was able to comment on its location). We're finding that at moderate loads (folding, gaming) it runs around 15C above GPU temp, and that under loads like Firestrike Ultra it can be nearly 30C above GPU temp.
 

Woomack

Benching Team Leader
Joined
Jan 2, 2005
On my card with the stock BIOS, everything works in GPU-Z, but when I switch to the Vega 64 BIOS some sensors stop working.
I'm not sure at what memory temp throttling starts. I heard it's ~75°C, but the absolute max temp for 2-3 spots is 105°C, at which the card is supposed to throttle or shut down.
Anyway, on my cooling I won't get much more out of it. I switch between the low power Vega 56 BIOS and the standard Vega 64 BIOS. Low power gives me good enough performance while being 100% stable with no audible coil whine, while the Vega 64 BIOS is for benching, when I don't care about 100% stability as long as the benchmark passes.
 
OP
crull

Member
Joined
Dec 19, 2002
I was curious about the HBM height so I took mine all apart.

I tried using some pressure sensitive paper to determine the height difference of the dies, but the paper didn't show anything clear enough to read.

I think my next step is to try using a copper shim, maybe just over the HBM or a larger one over the entire package, as long as it's thin enough to bend.
 

Zerileous

Senior Member
Joined
Jun 21, 2002
I don't know bud. HBM is very fragile, too much pressure and it suffers an internal crack that is not visible but will brick your card. I wouldn't do anything that had a potential to put more pressure on the HBM than the GPU.
 
OP
crull

Member
Joined
Dec 19, 2002
Thanks for telling me that. Maybe that's why the HBM has been reported to be lower than the GPU die in some cases. Could they have done that on purpose so the HBM never gets full pressure from the heatsink? I checked mine as best I could and if it's lower on mine it's not by much.

A little while back I installed this on the back of my card. I used thermal adhesive and glued it to the X bracket with some Fujipoly pad under it.

http://www.r-digital.it/index.php?m...ducts_id=650&zenid=8ka3a8s9q0uv9vrij21pjoodh0

I never checked the thermals on it under load until now.

Using my temp gun while the card is under stress running Firestrike Ultra, it's roughly 5C higher than the core, measured at the center of the heatsink.
 

Woomack

Benching Team Leader
Joined
Jan 2, 2005
I ordered a Bykski water block for my V56 Nano as it's the only one still available in stores. It arrived yesterday so I was checking how it fits. All seems great, and quality is about the same as top brands ... except for the details. The thermal pads were too thick, and after installing the block there was still a gap between the block and the GPU/HBM. There is no English manual, and I got a manual for cards based on the full length PCB. It's supposed to be similar but there are some differences. Anyway, it took me maybe 2h to install it with (I think) good pressure and contact. I had no time to check how it works; maybe I will check it today after work.

Btw. I was looking at the GPU/HBM on my card and both seem to be at the same height. Actually, if not for the filling between them, I would say it's one chip.
I've seen some posts around the web and it looks like it's more about different pressure between various coolers and spots than about the GPU/HBM height. If you check coolers, some have additional spots where the tightened screws apply extra pressure. I could be wrong, but on my card I've seen that it depends on how tight the screws near the GPU/HBM were. When they were tightened to the max, the TIM spread like it should, but when they were a bit loose, the HBM area didn't get enough pressure while the GPU side was fine.

Edit:
Everything works fine. I wanted to check how it behaves with a 120mm rad, and after ~80 3DMark loops (mixed tests) temps without OC were: 49°C GPU, 54°C HBM, 68°C VRM and 63°C hot spot. That's at stock settings with default voltage up to 1.25V. It should be better with a larger rad and after undervolting.
 
OP
crull

Member
Joined
Dec 19, 2002
I wanted to try a full shim over everything, but once I got the H55 off I changed my mind. I checked the height of everything by putting a really thin coating of thermal compound on all the dies and then clamping the cooler back on tight. When I removed the cooler, I noticed that the GPU was making good contact but the HBM was not. In fact the HBM was just barely touching the cooler, so on mine the HBM stacks sit lower by around 0.1mm. So I decided, even with the risk, to try making a shim that would cover just the HBM. I used a fin from a spare copper cooler I had. I sized it to fit over both HBM chips and bent the edge to fit in the gap between the GPU and HBM so it wouldn't move. I sanded and sanded to get it as thin as possible so it would match the height of the GPU. I tried it out and the results weren't that great, but I think that was only because it was a little bit too high.

I took the shim off for now and plan to sand it some more, then will try it again.


That 0.1mm was not measured; I'm using that number because it was mentioned in other forums. AMD acknowledged it at some point and said it shouldn't affect anything.
 