Why use die simulators for assessing waterblock performance?
Reading around on various forums I see a lot of rhetoric revolving around "real world" vs "artificial" performance. I also see some incorrect assumptions and parallels being drawn. Please allow me to explain the reasons why die simulators are better.
First, let's start with the issues with "real world" CPUs, and why, when we stick different blocks on a CPU, we see them all read the same. As we know, the temperatures reported by the CPU are read from thermal probes that are often located in the coolest portion of the CPU die. A document from Intel details this fact:
here (sadly the document has been removed; it is rehosted here). While I have no link showing that AMD does the same thing, enough testing reveals a similar pattern of behavior. The location of the diode means that what it reports is often completely "numb" to what's really going on in the hottest portions of the CPU. It works much like the following analogy: if you take a long metal rod, heat it at one end with a flame, run cold water over the middle of the rod, and then measure the temperature at the other end, how representative is that reading of how hot the flame-enveloped end is getting? This is precisely what happens when you put effective water-cooling on top of a CPU. The CPU may be getting hot, and waterblocks are pretty good at removing that heat, so by the time the heat reaches the user-readable thermal probe, you're seeing a "numbed" picture of what's going on in the hottest sections.
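If you want to put rough numbers on that analogy, the classic 1D fin equations will do. The sketch below is purely illustrative: the rod dimensions, copper conductivity, heat input, and cooling coefficient are all assumed values, not measurements from any CPU.

```python
# Hedged numeric sketch of the rod analogy (all values are illustrative
# assumptions): heat goes in at one end, a water-cooled middle section acts
# as a fin, and the far "probe" end carries no heat. Classic 1D fin formulas.
import math

k, area, perim = 400.0, 1e-4, 0.04        # copper: W/mK, cross-section m^2, perimeter m
h_w, t_water, q_in = 5000.0, 25.0, 50.0   # cooling coeff W/m^2K, water temp C, heat W
L_hot, L_cool = 0.033, 0.033              # heated run and cooled-section lengths, m

m = math.sqrt(h_w * perim / (k * area))   # fin parameter, 1/m
# Excess temperature where the rod enters the cooled section (adiabatic-tip fin):
theta_base = q_in / (math.sqrt(h_w * perim * k * area) * math.tanh(m * L_cool))
t_enter = t_water + theta_base
# Pure conduction back up the heated run to the flame end:
t_flame = t_enter + q_in * L_hot / (k * area)
# Decay across the cooled section; nothing flows past it, so the probe end
# simply sits at the cooled-section exit temperature:
t_probe = t_water + theta_base / math.cosh(m * L_cool)

print(f"flame end ≈ {t_flame:.0f} C, probe end ≈ {t_probe:.0f} C")
# e.g. flame end ≈ 84 C while the probe reads ≈ 28 C: a thoroughly "numbed" view.
```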
Incidentally, as stated in that paper, Intel has a second TCC probe that controls critical thermal shutdown of the CPU if it gets too hot. That probe is located in the hottest section of the CPU, but users can never see what it reports. We can still get an idea of what's going on, though. Intel states that the TCC probe is specifically calibrated to thermally shut down the CPU when it reads 135°C, and is also responsible for thermal throttling. XBitLabs conducted an experiment, here, which demonstrates the difference between what the user sees via the cool thermal probe and what the calibrated TCC probe is doing when it slows or shuts down the CPU at calibrated internal temperatures. That experiment shows the vast gulf between what CPUs report to the user and what is actually going on.
Okay, now that we understand why different waterblocks can report the same temperature on the same CPU, let's look at why we want to use die sims, and more specifically bare-die sims (no IHS involved).
When developing waterblock designs, the designer focuses on extracting the maximum possible rate of thermal transfer between the metal of the waterblock and the water flowing through it. This transfer rate is the effective convective heat-transfer coefficient, denoted h in most engineering texts. Its units are W/m²K: watts of energy moved per unit area per kelvin (Celsius degree) of temperature rise.
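In other words, h relates heat flow, contact area, and temperature difference through q = h·A·ΔT. A trivial sketch with made-up numbers:

```python
# q = h * A * dT: watts moved per unit area per kelvin of temperature
# difference. All numbers here are illustrative assumptions.
h = 40_000.0    # convection coefficient, W/m^2K (hypothetical block)
area = 1e-4     # 10 mm x 10 mm contact patch, in m^2
dT = 10.0       # metal-to-water temperature difference, K
print(f"{h * area * dT:.0f} W crosses the interface")  # 40 W here
```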
By using a die sim of known size, with a known and evenly spread thermal output applied to the base of the waterblock, and by measuring the temperature rise of that die sim, we can arrive at a fairly confident approximate value for h. The higher the value of h, the more effectively the waterblock transfers heat into the water flowing through it. When it comes to designing waterblocks for pure performance, h is about the ONLY thing that matters. Get h up high, then tweak the base-plate thickness of the waterblock to suit your value of h, and you have your high-performance waterblock. This is a simplistic description of the design process, but in a nutshell that's what a waterblock designer/engineer does.
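As a minimal sketch of that measurement (assumed numbers, and glossing over the correction for conduction through the baseplate and thermal interface material that a real test bench would apply):

```python
def estimate_h(q_watts, die_area_m2, t_die_c, t_water_c):
    """Back out the effective convection coefficient h = Q / (A * dT), in W/m^2K."""
    return q_watts / (die_area_m2 * (t_die_c - t_water_c))

# e.g. 100 W spread evenly over a 10 mm x 10 mm die sim,
# die surface at 45 C with water at 30 C (illustrative values):
h = estimate_h(100.0, 1e-4, 45.0, 30.0)
print(f"h ≈ {h:,.0f} W/m²K")   # ≈ 66,667 for these numbers
```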
So how does a high h value help? It tells us how well the waterblock will transfer heat into the water, regardless of what's making that heat, and regardless of whether an IHS is involved. The higher the value of h, the lower the temperature difference required between the heat source and the waterblock to move a given amount of heat energy into the water for any particular unit of area.
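Rearranging the same relation to ΔT = q / (h·A) makes the point directly. With two hypothetical blocks and made-up numbers:

```python
q, area = 100.0, 1e-4            # 100 W over a 10 mm x 10 mm die (assumed)
for h in (20_000.0, 60_000.0):   # two hypothetical waterblocks
    print(f"h = {h:,.0f} W/m²K -> runs {q / (h * area):.1f} K above the water")
# h = 20,000 -> 50.0 K hotter; h = 60,000 -> only 16.7 K hotter
```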
Where IHSs come into their own little world of confusion is this. The waterblock may be doing a fantastic job of transferring heat into the water, but the IHS may not be making even contact with the CPU die and/or the waterblock. How many times have we all seen people with very unflat IHSs? Lots. Almost every single CPU has a non-flat IHS, because the IHS is a mechanically attached part. Variations in the glue that holds the IHS against the CPU packaging, in the manufacturing process, and in the way the IHS is formed and cools (causing warping) all add up to a piece of metal sitting between your CPU and your waterblock that is unlikely to be contacting both evenly.
This is where testing with IHSs, even on die sims, becomes an issue. Because the IHS can never be guaranteed to sit evenly, it applies the heat load to the waterblock in an uneven fashion. As you can imagine, trying to determine a waterblock's h value when there's no guarantee that the heat is being spread evenly over the assumed surface area is a total nightmare. This is the sort of thing that leads people to believe they have a fantastically performing waterblock, especially if they measure only the IHS temperature and not what's going on at the CPU die level. And if you're measuring what's going on at the die level via a real CPU's on-die diode, re-read the second paragraph at the start of this post: it's telling you next to nothing either.
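A toy model shows how badly this skews the numbers. Assume (purely for illustration) that only some fraction of the IHS makes good contact and everything else is ideal:

```python
q, area, h_true = 100.0, 9e-4, 40_000.0   # 100 W, 30 mm x 30 mm IHS, true h (assumed)
for contact in (1.0, 0.6, 0.3):           # fraction of the IHS in real contact
    dT = q / (h_true * area * contact)    # heat squeezes through less area
    h_apparent = q / (area * dT)          # what a tester computes using the full area
    print(f"contact {contact:.0%}: dT = {dT:.2f} K, apparent h = {h_apparent:,.0f} W/m²K")
# The very same block "measures" a different h on every differently warped IHS.
```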
The thing here is that a good waterblock maximises the convective transfer coefficient where the heat is at its strongest. Then it doesn't really matter whether or not people use an IHS: if the waterblock can soak up the heat well, then even though the IHS may be warped, you can be sure your CPU is still being kept cool. Ideally we don't want the IHS there at all, because it can flex, cooling only the edges of your CPU, or only the center and not the edges, and then fail to transfer the heat evenly to the waterblock as well. Which waterblock is best for gnarly IHSs is the subject of another debate, but I can tell you now that IHSs introduce such variability that they can make one block look crap and another look good. Put the same blocks onto a different CPU and the story can change. That's not consistent enough to judge, or even design, waterblocks with. That's just rolling the dice.
Designers use bare-die sims because it is good science, and because they provide the clearest and most succinct indication of how well a waterblock design is performing and how well it transfers heat into the water over a given surface area. The use of IHSs, even in testing, and the reliance upon "real world" testing with "numb" thermal probes, has heavily clouded this very important point, purely because there are no guarantees with IHSs. They're random. You can form any conclusion you like with IHSs, because every tester will likely reach a different conclusion on a different CPU. Does that serve the public?
In the end, it's up to the individual to understand the difference between marketing and good science. I hope some of the above gives people something to chew on when considering how relevant "real world" and "IHS-bound" tests are.