Excellent theoretical exposition – Antoine Dechaume
Note: This article is an English translation from a French version which is available HERE.
These articles aim to give everybody the basic principles involved in heat transfer and cooling as applied to processors. These principles are not specific to CPUs – they apply to any device that produces heat, but here we will only consider the CPU.
The scientific background necessary to understand these articles has been reduced to a minimum, while trying to keep them coherent and to address topics – and myths, untruths, … – frequently seen on forums and websites. To this end, many simplifications and approximations are made relative to the rigorous theory found in the literature.
I tried to do my best to write articles that are at the same time precise,
complete and easy to understand, but nobody’s perfect, so feel free to mail
As a general rule, everything that consumes electricity heats up; this is called the Joule effect. Electrical energy is converted into another form of energy, thermal energy or heat (measured in joules, J). A processor does not escape this rule – unfortunately, it generates heat.
Rather than heat, we use thermal power to measure this phenomenon: the quantity of heat released in one second (in joules per second, J/s, or watts, W). Thermal power is the key parameter, the one conveyed along the path processor → cooling system → ambient air. If a processor releases a thermal power P, the ambient air ultimately receives that same thermal power, P.
For a given processor, the thermal power depends on three parameters:
- The electric voltage provided to the CPU
- The frequency
- The load
The released thermal power is proportional to the frequency (if each CPU cycle needs a certain quantity of energy, then the more cycles there are in one second, the greater the power dissipated by the Joule effect), to the load, and to the square of the voltage. If any of these parameters increases, thermal power increases. This is where things get worse, because one always seeks more megahertz!
Furthermore, to stabilize the processor at a higher frequency, increasing the voltage is essential (to improve the signal-to-noise ratio, electric noise being a byproduct of heat), which causes the CPU to produce an even greater amount of thermal power!
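The proportionality described above can be tried out numerically. The following is only a sketch: the function name and the reference figures (60 W at 2.0 GHz and 1.50 V) are hypothetical, chosen for illustration, and the model is nothing more than "power scales with load, frequency and the square of the voltage".

```python
def scaled_power(p_ref, f_ref, v_ref, f_new, v_new, load_ratio=1.0):
    """Scale a reference thermal power using P ~ load * f * V^2.

    p_ref is the power (W) measured at frequency f_ref and voltage v_ref.
    load_ratio is the new load divided by the reference load.
    """
    return p_ref * load_ratio * (f_new / f_ref) * (v_new / v_ref) ** 2


# Hypothetical CPU: 60 W at 2.0 GHz and 1.50 V,
# overclocked to 2.4 GHz with the voltage raised to 1.65 V.
p_overclocked = scaled_power(60.0, 2.0, 1.50, 2.4, 1.65)
print(f"Estimated thermal power: {p_overclocked:.1f} W")
```

In this made-up case a 20% frequency increase combined with a 10% voltage increase raises the thermal power by about 45%, which shows how quickly the voltage term dominates.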
A given CPU will work at a given frequency only if its temperature is below a certain value, and the higher the frequency, the lower this value. But going extremely low is useless – the silicon of the processor becomes insulating (nevertheless, it’s a long way to get there). Temperature is thus a very sensitive factor, and this is why one wants to cool.
We will start with the basic concepts such as conduction, convection and a little bit of radiation, just to say that it exists because it plays only a negligible role for what concerns us. Next, basic concepts about the fluid’s interaction with cooling system components will be addressed, including pressure drop, fans and pumps.
We will then see the application of these principles to systems used daily to cool processors, like heatsinks and waterblocks, heat exchangers, phase change systems (refrigeration, heat-pipe, evaporative cooler/bong) and finally TECs. The goal of these articles is to have the necessary basis to describe the principles of the various cooling systems available.
In all the following articles, we assume that the physical phenomena coming into play are stationary, ie none of the parameters considered varies with time and the system is at equilibrium. Of course, reality is not like that – the processor generates more or less heat according to whether it is more or less loaded. But then why neglect time-related variations?
First, because it greatly simplifies things, and second, because it changes almost nothing in the final analysis. It is enough to be concerned with what occurs when the CPU is fully loaded, when it generates the most thermal power, because what is valid under these conditions will also be valid under the others.
Heat does not stay where it is: it propagates from hot parts to cold parts, tending to restore thermal equilibrium, in other words a uniform temperature everywhere. For all materials (solid or fluid) at rest, this phenomenon is described by what is called Fourier’s Law. In order to establish the principal characteristics of conduction, we will consider the simplest form of Fourier’s Law, applied to a very simple geometry – a plate heated uniformly on one of its faces.
We thus take a plate of some material, of thickness L (in meters, m), whose two large faces have the same area A (in square meters, m²) but different temperatures. The other faces are assumed to be thermally insulated, so that all the heat flow which enters one of the large faces exits from the other. Once again reality is different: the side faces will generally be exposed to the ambient air, which will absorb heat, but so little that we can adopt this assumption without changing the scope of what follows.
The simplified version of Fourier’s Law gives a relationship between the thermal power P (W) which crosses the plate, the temperature difference between the hot face Th and the cold face Tc (in kelvins, K, or degrees Celsius, °C; it doesn’t matter, as we only consider temperature differences), the surface area A (m²), and the thickness L (m).
This law states that the thermal power is proportional to the temperature difference Th – Tc and to the surface area A, and inversely proportional to the thickness L. The proportionality factor is called the thermal conductivity k and is thus measured in W/(m K); it generally varies with temperature and location, but for our purposes these variations are very weak and we will consider thermal conductivity as a constant. It follows:
P = k × A × (Th – Tc) / L
The following table presents the values of thermal conductivity for the most used materials, at ambient temperature:
Thermal conductivity in W/(m K)
Let’s look more closely at this formula; imagine that the lower surface of the plate is heated with a power P (as a CPU having the same surface area would do). The temperature of the hot face is then:
Th = Tc + P × L / (k × A)
Thus, if Tc is given (to fix ideas, one can take the ambient air temperature as a reference), obtaining the lowest temperature Th requires the smallest thickness, the largest surface area and the highest conductivity. Concerning thickness and surface area, we will see a little further on that things are not that simple: this simplified Fourier’s Law is rather limited.
For thermal conductivity, on the other hand, it is always true: the higher, the better. Diamond and silver are the most effective materials, but copper has the best conductivity/price ratio; silver is not really of interest considering its slight advantage over copper and its prohibitive price.
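As a quick numerical illustration of the simplified Fourier's Law for the plate, here is a minimal sketch. The conductivities used (about 400 W/(m K) for copper and about 237 W/(m K) for pure aluminium) are approximate textbook values rather than figures taken from the table of this article, and the plate dimensions and power are hypothetical.

```python
def plate_resistance(thickness_m, conductivity, area_m2):
    """Thermal resistance L / (k A) of an insulated plate, in K/W."""
    return thickness_m / (conductivity * area_m2)


def hot_face_temperature(t_cold, power_w, thickness_m, conductivity, area_m2):
    """Th = Tc + P * L / (k * A), temperatures in degrees Celsius."""
    return t_cold + power_w * plate_resistance(thickness_m, conductivity, area_m2)


# Hypothetical case: 70 W through a 5 mm plate of 1 cm^2, cold face at 25 C.
K_COPPER = 400.0      # W/(m K), approximate
K_ALUMINIUM = 237.0   # W/(m K), approximate

for name, k in (("copper", K_COPPER), ("aluminium", K_ALUMINIUM)):
    th = hot_face_temperature(25.0, 70.0, 0.005, k, 1e-4)
    print(f"{name}: hot face at {th:.1f} C")
```

With these assumed numbers the copper plate keeps the hot face several degrees cooler than the aluminium one, purely because of its higher conductivity.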
Some physical laws describe phenomena which behave similarly, which is why an analogy with electricity is used: thermal resistance. It relates the temperature difference (the analogue of electric voltage) to the thermal power (the analogue of current intensity); Fourier’s Law is the thermal equivalent of Ohm’s Law:
Th – Tc = R × P
Thermal resistance is measured in °C/W. It is a very important quantity because it constitutes the only usable means of measurement in practice, and all heat exchanges will be expressed through this concept. Thermal resistance enables one to measure the effectiveness of a cooling system; this is the figure to look at. Keep that in mind while waiting for the part on convection to learn more.
Naturally, we are looking for the smallest possible thermal resistance in order to have the lowest temperature Th (the nearest to Tc, in fact). For the plate we see that
R = L / (k × A)
and we can draw the equivalent electric diagram:
Now let us consider two plates, one on top of the other, still assumed to be insulated on the sides, with conductivities k1 and k2 and thicknesses L1 and L2, but with the same surface area A:
For each plate, we independently have:
Th – Ti = R1 × P, with R1 = L1 / (k1 × A)
Ti – Tc = R2 × P, with R2 = L2 / (k2 × A)
Ti being the temperature of the interface between the two plates, and R1, R2 the thermal resistances of each plate.
Summing both, we obtain:
Th – Tc = (R1 + R2) × P
We then obtain a relation which is valid for the two plates together and write:
R = R1 + R2
It works as in electricity for resistors in series: resistances are added!
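The series rule can be sketched in a few lines. The resistance values, power and cold-face temperature below are hypothetical; the helper simply walks from the cold face back to the hot face, adding R × P at each layer.

```python
def series_resistance(resistances):
    """Total resistance of layers crossed one after the other, in K/W."""
    return sum(resistances)


def node_temperatures(t_cold, power_w, resistances_hot_to_cold):
    """Temperatures at every face, listed from the hot face to the cold face.

    resistances_hot_to_cold lists the layer resistances in the order
    the heat crosses them.
    """
    temps = [t_cold]
    for r in reversed(resistances_hot_to_cold):
        temps.append(temps[-1] + r * power_w)
    return list(reversed(temps))


# Hypothetical two-plate stack: R1 = 0.05 K/W (hot side), R2 = 0.10 K/W.
layers = [0.05, 0.10]
th, ti, tc = node_temperatures(30.0, 70.0, layers)
print(f"R = {series_resistance(layers):.2f} K/W")
print(f"Th = {th:.1f} C, Ti = {ti:.1f} C, Tc = {tc:.1f} C")
```

The same helper accepts any number of layers, which is convenient once contact and spreading resistances are inserted into the chain.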
With these two contacting plates, the formula used previously supposes they are in perfect contact, which would be the case if the two surfaces which make the interface between the two plates were perfectly flat and smooth. Once again, reality is cruel but here we can’t neglect it! Surfaces in contact are not smooth nor flat, they always have a rough pattern which looks like this:
The effect of roughness is very important; in general, the real contact area is never more than 2% of the apparent interface area – yes, 2%! This leaves plenty of room for micro-cavities filled with air, which has a very poor thermal conductivity (see the table of conductivities above). This whole thickness (0.5 – 60 μm for flat surfaces) of roughness and air gaps forms a thin layer which conducts heat poorly; this translates into a contact resistance Rc, which depends upon:
- the form and the distribution of roughness (ie surface finish)
- the thermal conductivity of the air, or of the material, which fills the gaps (higher is better)
- the hardness of the materials and the contact pressure between the two plates (softer materials and higher pressure make it easier to “crush” the roughness and thus offer a greater real contact area)
- the apparent interface surface area (a larger interface offers less resistance)
This resistance is added to the thermal circuit:
R = R1 + Rc + R2
The means to reduce the contact resistance are well known: lapping, to obtain the flattest and smoothest possible surfaces (the temperature can be reduced by up to 4°C, thanks to Bill Adams); replacing the air cavities with a substance of better thermal conductivity, like thermal paste; and increasing the mounting pressure by tightening the assembly as much as possible (but, in practice, within the range of what a CPU can withstand!).
Let us complicate things a little more: Let’s consider the case of two plates in contact having different surface areas and look at the effect on the second plate.
It is useful because, as we saw at the beginning of this article, increasing surface area decreases thermal resistance. But another factor comes into play, which reveals the weakness of the ‘light’ version of Fourier’s Law: heat is not spread uniformly over the upper surface A2 of the second plate; thus, although this surface is larger, heat exchange is more intense in the center, as if only a part of it were used.
This reduces the benefit of increasing the surface area! This fact translates into what is called the spreading resistance Rs, which depends on the ratio of the contacting surface areas, the thickness of the second plate and the heat transfer on top of surface A2.
Why thickness? Because a thicker plate reduces the lateral thermal resistance, so that heat can be better distributed: the larger the thickness, the smaller Rs is. This is the reason why cold plates are used, at the base of a heatsink or waterblock between a CPU and a Peltier.
Once again, this resistance is added to the others:
R = R1 + Rc + R2 + Rs
Those who followed up to here will think: eh, thermal resistance increases with thickness, but here it decreases?! Indeed, choosing the thickness of the second plate is not obvious, because if it is larger, on the one hand Rs decreases, but on the other hand R2 increases. There is a compromise for which R2 + Rs is the smallest possible. This ideal thickness depends on the whole design of the cooler, but generally, in CPU cooling, beyond a thickness of 10 mm one loses more than one gains…
We thus saw the factors that come into play in thermal conduction, translated thanks to the concept of thermal resistance which is the only parameter that makes possible the expression of the performance of a cooling system. As a summary, we can make a brief recap of these resistances:
- R: Pure conduction, related to the thickness of materials
- Rc: Contact, related to the state of surfaces in contact
- Rs: Spreading, related to the difference in surface area between the devices in contact
In the final analysis, what needs to be kept in mind for the application of interest, ie cooling a processor, is only the case of a plate in contact on one side with a CPU whose surface area is smaller and whose surface state is given (the pioneering days of lapped CPUs seem to be over), while the surface of the plate has a controllable but imperfect state. We can thus write the relation which gives the temperature of the CPU:
Tcpu = Tx + (Rc + R + Rs) × P
Here Tx represents the temperature of the upper face of the plate (generally taken equal to its average temperature), but what is the meaning of this x? Quite simply that this temperature is, a priori, unknown! The plate only constitutes a part of the cooling system: the base plate. Determining this temperature requires knowing what occurs above the plate, which is the subject of the following article: Convection…
Using the movement of a fluid to cool is a natural idea.
In winter for example, the more the wind blows, the more it cools – similar to blowing on a hot coffee cup or a soup spoon to cool it faster. These illustrations highlight what will be considered: The fluid movement effect on heat exchange or convection heat transfer.
It is roughly the same equation as for Fourier’s Law, but some ingredients have changed. Newton’s Law gives a proportionality relationship between, on the one hand, the thermal power P given by the solid to the fluid, and on the other hand the difference between the surface temperature Ts of the solid (the plate which conveyed the heat up to here) and the temperature Tf of the fluid, as well as the fluid-solid contact area S.
The proportionality factor is h (in W/(m² K)) and is named the convective transfer coefficient; Newton’s Law is thus written:
P = h × S × (Ts – Tf)
Generally, surface and fluid temperatures vary according to where we look, so it is appropriate to specify what is meant by temperature. In this relation, they are generally defined in the following way:
- Ts is generally an average temperature over the solid surface
- Tf is generally an average temperature of the fluid in contact with the solid
We can now express the thermal resistance associated with convection:
Rcv = 1 / (h × S)
What consequences can we draw from this expression? Well, we always seek to make it as low as possible. Obviously, it is desired to have a high convective transfer coefficient and a large surface area. Concerning surface area, nothing special except that we will see later how to increase it with fins.
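To get a feel for the orders of magnitude, the convection resistance 1 / (h S) that follows from Newton's Law can be evaluated for a bare plate and for a finned surface. This is only a sketch: the value of h and both areas are hypothetical, picked for illustration.

```python
def convection_resistance(h, area_m2):
    """Rcv = 1 / (h * S), in K/W, with h in W/(m^2 K) and S in m^2."""
    return 1.0 / (h * area_m2)


H_ASSUMED = 50.0       # W/(m^2 K), hypothetical value for fan-driven air
bare_plate = 0.01      # m^2, a 10 cm x 10 cm flat plate
finned = 0.20          # m^2, the same footprint with fins (hypothetical)

print(f"bare plate: {convection_resistance(H_ASSUMED, bare_plate):.2f} K/W")
print(f"with fins:  {convection_resistance(H_ASSUMED, finned):.2f} K/W")
```

Under these assumptions the bare plate would sit 2 °C above the fluid per watt, which is hopeless for a CPU, while multiplying the wetted area by twenty divides the resistance by the same factor.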
Concerning the convective transfer coefficient, things become complicated! If we want to understand how it is influenced and how to increase it, it is necessary to understand what occurs in the fluid and how it is characterized.
Fluid characteristics are defined by two types of parameters :
- Physical properties of the fluid
- Nature of the flow
The properties shown will generally vary with the temperature (inter alia), but these variations are negligible for what concerns us and we will assume that these properties are constant:
- Thermal conductivity k (in W/(m K))
- Specific heat c (in J/(kg K)), which expresses the fact that one kg of air contains a smaller quantity of heat than one kg of water when both are at the same temperature
- Density ρ (pronounced “rho”, in kg/m³), which expresses the fact that one cubic meter of air is lighter than one cubic meter of water
- Viscosity μ (pronounced “mu”, in kg/(m s)), which expresses the fact that it is harder to move water than air
A small table to recapitulate these values, taken at ambient temperature and pressure:
You can already notice that the physical properties of water have higher values than those of air. These values will be commented upon later, when we are able to connect them to the convective transfer coefficient.
The flow regime can be either laminar or turbulent, depending, for a given fluid, on the geometry of the cooling system in contact with the fluid and on the fluid velocity. For a given cooling system, the flow becomes turbulent beyond a certain fluid velocity, and increasingly so as velocity increases.
The flow is laminar when it is possible to identify streamlines in the fluid, which slip over one another (caution! not without friction); for instance, by injecting dye in the flow, we would see it moving gently while becoming slightly deformed.
A flow is turbulent when it is no longer laminar; that is to say, we don’t see these streamlines any more; the dye is diffused, mixed, chaotic, and nothing is easily identifiable except the presence of vortices. The latter are one of the characteristics of turbulence, which shows a whole set of vortices of very different sizes.
This characteristic results in one of the properties of turbulence which concerns us in heat exchange: Mixing.
We will return again to this topic…
The following picture shows the differences between laminar and turbulent flow; in the latter, the dye is much more mixed and homogenized:
Now that we have in mind how a fluid is characterized, it is still necessary to dig a little more to understand how the convection coefficient h behaves. In what follows, we will see what happens to the fluid close to the solid’s surface, as it is here where everything of interest will occur. This zone bears the sweet name of the ‘boundary layer’.
In fact, it should be plural – there are two types of boundary layers. One which always exists, the dynamic boundary layer, is related to the movement of the fluid. The other one will only exist when there is heat exchange between the solid and the fluid, the thermal boundary layer.
Here are the facts: The velocity of the fluid particles in direct contact with the solid is zero (because of viscosity), whereas this is not the case in the middle of the flow (where velocity is generally at its maximum). Velocity won’t suddenly go from zero to its maximum; the change is gradual, and it occurs in this boundary layer. This is where the viscosity effects of the fluid, as well as friction, take place.
The following picture shows a laminar dynamic boundary layer; it was shot a moment after having injected dye on the left vertical black line:
Here are the facts: The fluid particles’ temperature in direct contact with the solid is equal to the solid surface’s temperature, whereas this is not the case in the middle of the flow (where generally temperature is minimal when the fluid cools the solid). Temperature won’t go suddenly from its value at the surface to its minimum, but it will be achieved gradually, and this occurs in this boundary layer. Here is where the effects of the fluid’s thermal conductivity take place.
The following picture shows a laminar thermal boundary layer; it was obtained by differential interferometry (the density and the refractive index of the heated fluid vary):
In order to further explain this coefficient, and for ease of understanding, we will assume that the two boundary layers coincide. We also will assume that the fluid is slowed down so much there that we can consider that the heat exchange is done by pure conduction.
Under these circumstances, the fluid will behave, from the thermal point of view, like a solid and the heat transfer will obey Fourier’s Law. If δ (pronounced “delta”) is the thickness of the boundary layer, Ts the temperature of the plate and Tf the temperature at the edge of the boundary layer (further out, the fluid temperature hardly changes), then we can write Fourier’s Law for conduction in the boundary layer:
P = k × S × (Ts – Tf) / δ
Here S is still the contact surface area between solid and fluid and k is the thermal conductivity of the fluid. Thus thermal resistance can be written:
R = δ / (k × S)
If we remember the expression of the “true” thermal resistance for convection, 1 / (h × S), we can finally write:
h = k / δ
At last, something to chew on in connection with h! We were looking for high values of h. We can thus interpret (and that’s all we can do, since this relation is just a trick to understand things qualitatively) that the lower δ is and the higher k is, the higher h is. Concerning k, one would have suspected it: a more conductive fluid is better!
Concerning δ, it means we will seek the smallest possible boundary layer thickness. The boundary layer gets thinner as the fluid velocity increases, and even more so when the flow is turbulent. Why does turbulence improve convection? Thanks to mixing, which homogenizes temperature so that temperature variations are pushed back closer to the solid surface.
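The qualitative relation h = k / δ can be played with numerically, keeping in mind that it is only a reasoning aid and not a design formula. In the sketch below, the fluid conductivities (about 0.6 W/(m K) for water and about 0.026 W/(m K) for air near room temperature) are approximate textbook values, and the boundary layer thicknesses are hypothetical.

```python
def film_coefficient(fluid_conductivity, boundary_layer_m):
    """Qualitative estimate h = k / delta, in W/(m^2 K)."""
    return fluid_conductivity / boundary_layer_m


K_WATER = 0.6    # W/(m K), approximate, room temperature
K_AIR = 0.026    # W/(m K), approximate, room temperature

# Hypothetical boundary layer thicknesses: 0.5 mm, then thinned to 0.1 mm
for delta in (0.5e-3, 0.1e-3):
    h_water = film_coefficient(K_WATER, delta)
    h_air = film_coefficient(K_AIR, delta)
    print(f"delta = {delta * 1e3:.1f} mm: "
          f"h_water = {h_water:.0f}, h_air = {h_air:.0f} W/(m^2 K)")
```

For the same boundary layer thickness, the estimate for water is more than twenty times that for air, simply because of the conductivity ratio, and dividing δ by five multiplies h by five for either fluid.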
Just a little comment: Strictly speaking, we should talk about thermal conducto-convection, because, as we just saw, heat exchange between a solid and a fluid simultaneously involves the phenomena of conduction and convection, both being closely coupled.
But the convective transfer coefficient also depends on the viscosity, the density, and the specific heat of the fluid. Unfortunately, to analyze how these properties influence h, it would be necessary to return to a more theoretical analysis – this is not the goal of these articles.
Forget the influence of viscosity – it basically does not change the scope of the conclusions made here, but it will return when we consider pressure drop. Regarding specific heat and density, we can dig a little more with the help of another law: the First Principle of Thermodynamics. This principle, whose usefulness goes well beyond the study of h, will enable us to conclude our review of thermal convection.
Considering its most simplified version, it establishes the relationship between:
- The thermal power P absorbed by the fluid in contact with the solid
- The difference between the temperature at the output To and at the input Ti of the cooler (heatsink-fan or waterblock)
- The volumetric flow rate D (in m³/s)
- The specific heat of fluid c
- The density of the fluid ρ
The thermal power P is proportional to all the rest, and that gives:
P = ρ × c × D × (To – Ti)
P and Ti are given and do not depend on the cooler considered (P depends on the CPU and Ti on what occurs before the input). We are looking for the coldest solid surface; therefore the fluid temperature should also be as cold as possible along its course. Thus it is necessary to look for To closest to Ti. Writing this first principle of thermodynamics as follows:
To = Ti + P / (ρ × c × D)
we see that a higher flow rate D, a higher density ρ and a higher specific heat c are desired (actually the product of the last two). Concerning flow rate, this joins what we saw thanks to the interpretation with Fourier’s Law, ie the higher the flow rate, the higher the velocity. As for the other two, density and specific heat, they depend entirely on the choice of the fluid, since they are physical properties.
The last thing in connection with this principle: Let us compute To – Ti and you will notice that this difference is rather small, generally smaller than 1 °C for water with the usual flow rate of a water cooling system.
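The temperature rise of the coolant, To – Ti = P / (ρ c D), is easy to check numerically. In this sketch the fluid properties are approximate room-temperature textbook values (water: about 998 kg/m³ and 4180 J/(kg K); air: about 1.2 kg/m³ and 1005 J/(kg K)), and the power and flow rates are hypothetical examples.

```python
def coolant_temperature_rise(power_w, density, specific_heat, flow_m3_s):
    """To - Ti = P / (rho * c * D), in K (or degrees Celsius)."""
    return power_w / (density * specific_heat * flow_m3_s)


POWER = 70.0                      # W, hypothetical CPU
water_flow = 4.0e-3 / 60.0        # 4 L/min converted to m^3/s
air_flow = 50.0 / 3600.0          # 50 m^3/h converted to m^3/s

dt_water = coolant_temperature_rise(POWER, 998.0, 4180.0, water_flow)
dt_air = coolant_temperature_rise(POWER, 1.2, 1005.0, air_flow)

print(f"water: To - Ti = {dt_water:.2f} C")
print(f"air:   To - Ti = {dt_air:.2f} C")
```

With these assumed figures the water warms up by roughly a quarter of a degree across the cooler, in line with the "smaller than 1 °C" remark, whereas the air heats up by several degrees despite a volumetric flow about two hundred times larger.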
Two angles of attack arise naturally: One relates to h and the other to S, even both simultaneously. It is a question of increasing both in order to decrease the convection thermal resistance.
The “simplest” method is to replace the plate with a bigger one, but this is limited by the fact that the spreading resistance also increases, with its related drawbacks. The best method to increase the contact surface area between fluid and solid is to use fins, and the vast majority of coolers are provided with them. Rather than adding surface area by spreading out horizontally, it is added vertically.
On the practical and theoretical side, the effectiveness of a fin primarily depends on its thickness and height. Each part of the fin surface does not exchange the same amount of heat, since the temperature decreases from the base to the top of the fin. This results in a coefficient (the efficiency, ranging between 0 and 1) which weights the fin-fluid contact surface area, so that the fin’s heat transfer with the fluid occurs over an effective area smaller than the real contact area.
Generally, beyond the point where the temperature of the fin is nearly identical to that of the fluid, the rest of the surface exchanges virtually no heat with the fluid and is thus useless. Moreover, the higher the fin, the larger the volume of the cooler; if the flow rate is constant, the fluid velocity then decreases, and so does thermal convection… it’s a matter of compromises, as in most situations.
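The fin efficiency mentioned above can be illustrated with a classical textbook model that is not part of this article: a thin straight rectangular fin with an insulated tip, for which the efficiency is tanh(mH) / (mH) with m = sqrt(2h / (k t)). The sketch below assumes that model, and the values of h, k, thickness and height are hypothetical.

```python
import math


def fin_efficiency(h, conductivity, thickness_m, height_m):
    """Efficiency of a thin straight fin with an insulated tip.

    Textbook model (an assumption here, not taken from the article):
    eta = tanh(m * H) / (m * H), with m = sqrt(2 * h / (k * t)).
    """
    m = math.sqrt(2.0 * h / (conductivity * thickness_m))
    mh = m * height_m
    return math.tanh(mh) / mh


H_CONV = 50.0    # W/(m^2 K), hypothetical convective coefficient
K_FIN = 200.0    # W/(m K), hypothetical aluminium alloy
T_FIN = 1.0e-3   # m, fin thickness

for height in (0.01, 0.03, 0.06):
    eta = fin_efficiency(H_CONV, K_FIN, T_FIN, height)
    print(f"height {height * 1e3:.0f} mm: efficiency {eta:.2f}")
```

Under this model, doubling the height of a fin does not double the heat it exchanges, since the efficiency drops as the fin gets taller; this is the numerical counterpart of the useless upper part of the fin described above.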
The other improvements relate primarily to h but can also have a slight positive impact on the heat-transferring surface area. They generally consist of mechanisms to “break” the boundary layer in order to reduce it, disturbing the flow so that mixing and turbulence are increased. In practice, this is done by tweaking the geometry and the finish of the heat-transferring surfaces. Rough surfaces and the presence of turbulators achieve this.
But always keep in mind that the benefit in performance obtained thanks to these improvements will be paid on the other hand by pressure losses which will be addressed further… compromises…
Here is, as a conclusion, the summary of all that was seen and of what is to be kept in mind to understand convection. The convection thermal resistance Rcv is given by:
Rcv = 1 / (h × S)
To reduce it, it is necessary to have the largest effective heat-transferring surface area S and the highest possible convective transfer coefficient h. This means looking for a flow having:
- a fluid with high thermal conductivity
- a high flow rate…
- …even better, a more turbulent flow
- a fluid with high density and specific heat (in fact, their product)
Recall of the physical properties of water and air:
In the light of that, which is the most interesting fluid?
Water, on every count; hence the interest in watercooling.
To finish with heat transfer, let’s establish the link between convection and conduction seen in the first part of these articles.
Remember, thermal power leaves the surface of the CPU and goes to the base of the cooler through the interface between these two surfaces, with the corresponding contact resistance Rc; it then reaches the upper surface of the base, with the corresponding pure conduction resistance R and spreading resistance Rs.
Rcd, the sum of all these resistances (Rcd = Rc + R + Rs), accounts for the effects of thermal conduction up to here.
Then the thermal power is absorbed by the fluid, with thermal resistance Rcv. This is added to Rcd to finally give the total resistance R between the CPU and the fluid. This total resistance is defined by:
R = Rcd + Rcv = (Tcpu – Tf) / P
where Tcpu is the CPU’s temperature, Tf that of the fluid (practically the temperature at the entry of the cooler) and P the thermal power given by the CPU to the cooler. The temperature of the CPU can also be written:
Tcpu = Tf + R × P
I stress the importance of thermal resistance: it is the only parameter which makes it possible to measure the thermal performance of a cooling system, independently of the thermal power (thus, independently of the CPU, provided it has the same surface area) and of the temperature of the fluid at the entry (thus, independently of what lies upstream of the cooler considered).
This thermal resistance of a cooler will be roughly the same if the CPU is changed or if the inlet temperature changes. Note that the same cooler mounted on a CPU having a different surface area won’t have the same thermal resistance R. This fact is due to spreading and contact resistances, and on a bigger CPU, R is generally lower.
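The whole chain can be put together in a short sketch: conduction resistances (contact, pure conduction, spreading) plus the convection resistance give the total resistance, from which the CPU temperature follows. All numerical values below are hypothetical; the second helper goes the other way and deduces the total resistance of a cooler from measured temperatures and power.

```python
def cpu_temperature(t_fluid, power_w, r_contact, r_conduction,
                    r_spreading, r_convection):
    """Tcpu = Tf + (Rc + R + Rs + Rcv) * P, temperatures in Celsius."""
    r_cd = r_contact + r_conduction + r_spreading   # conduction part
    r_total = r_cd + r_convection                   # CPU to fluid
    return t_fluid + r_total * power_w


def measured_resistance(t_cpu, t_fluid, power_w):
    """Total resistance deduced from measurements: (Tcpu - Tf) / P."""
    return (t_cpu - t_fluid) / power_w


# Hypothetical cooler: Rc = 0.05, R = 0.02, Rs = 0.03, Rcv = 0.15 K/W
t_cpu = cpu_temperature(25.0, 70.0, 0.05, 0.02, 0.03, 0.15)
print(f"Tcpu = {t_cpu:.1f} C")
print(f"R = {measured_resistance(t_cpu, 25.0, 70.0):.2f} K/W")

# Same cooler, warmer inlet and more power: R is unchanged
t_cpu_hot = cpu_temperature(35.0, 100.0, 0.05, 0.02, 0.03, 0.15)
print(f"R = {measured_resistance(t_cpu_hot, 35.0, 100.0):.2f} K/W")
```

The second measurement returns the same resistance although both the inlet temperature and the power changed, which is exactly why thermal resistance, and not a raw temperature, is the figure to compare between coolers.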
This chart shows what kind of curve can be obtained for thermal resistance R with an Innovacool Rev. 3 waterblock (thanks to Bill Adams):
Knowing the thermal resistance of a cooling system is all very well, but that’s not enough to know what it’s really worth…
As we saw before, the total thermal resistance of the cooler varies with the flow rate – this is mainly due to convection, through Rcv. Therefore, when performance data is given, it is necessary to specify the thermal resistance with respect to the flow rate. In fact, this only applies to waterblocks, since for air-cooled heatsinks (which are generally sold with a fan) the flow rate is fixed. The effect of the cooling system on the fluid movement therefore only matters for waterblocks.
Imagine two different waterblocks which have exactly the same thermal performance (same resistance R) at any flow rate! How to decide between them?
Because of fluid friction on the waterblock’s surface, of the internal circuit layout (curve, elbows…) and because of the variations in cross section (input, output…), a considerable part of the power that the pump provides to the fluid will be lost. The consequence is a flow decreased by the presence of the waterblock, and thus less effective convection heat transfer!
These losses, which are reflected in the flow rate, are measured with what is called the pressure loss. In practice, we will thus look for a waterblock which causes less pressure loss, and this will enable us to decide between two waterblocks having same thermal resistance.
Thermal resistance and pressure losses are the two inseparable and indispensable means of performance measurement of any waterblock.
Pressure losses and their influence on the fluid circuit will also depend on all parts of the circuit. I refer you to the next article, which will deal with all this.
Another type of heat transfer alongside conduction and convection, thermal radiation, doesn’t need any material, solid or fluid, to exist. Even in a vacuum, heat can be exchanged by radiation.
Thermal radiation is closely associated with electromagnetic radiation: it can be described as the energy of matter (heat) being converted into energy of the “light” type, and conversely. Light here does not necessarily mean visible – thermal radiation also occurs in the infra-red spectrum.
This type of heat transfer will depend on the temperature and type of materials of the various elements considered. In the case which concerns us, the radiation of a given waterblock or heatsink will thus depend on the remainder of the environment which it “sees” (case, PSU, motherboard…).
In fact, relative to the other types of heat transfer, radiation will be significant only when there exist significant differences in temperatures between these elements. The sun (5900 °C) and fire (2800 °C) heat by thermal radiation; in these two examples, temperatures are significantly different compared to the ambient temperature. The good news here is that concerning CPU cooling, the heat transfer by radiation is negligible, thus let’s ignore it…
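To see why radiation can be ignored here, one can estimate it with the Stefan-Boltzmann law, which is not introduced in this article and is used below as an outside assumption: a grey surface of emissivity ε and area A at absolute temperature Ts, surrounded by an environment at Te, exchanges ε σ A (Ts⁴ – Te⁴). The emissivity, area and temperatures in this sketch are hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)


def radiated_power(emissivity, area_m2, t_surface_c, t_env_c):
    """Net radiative exchange of a grey surface with its surroundings, in W.

    Uses P = eps * sigma * A * (Ts^4 - Te^4) with absolute temperatures.
    """
    ts = t_surface_c + 273.15
    te = t_env_c + 273.15
    return emissivity * SIGMA * area_m2 * (ts ** 4 - te ** 4)


# Hypothetical heatsink: bare metal (emissivity assumed to be 0.1),
# 0.05 m^2 of radiating area, surface at 45 C inside a case at 30 C.
p_rad = radiated_power(0.1, 0.05, 45.0, 30.0)
print(f"Radiated power: {p_rad:.2f} W")
```

With these assumptions the heatsink radiates about half a watt, to be compared with the tens of watts carried away by convection. A darker surface with a higher emissivity would radiate more, but the small temperature difference keeps this contribution minor.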
Voila! That’s all folks, while waiting for the next part about pressure losses and the influence of the fluid circuit. I wish to thank Deejeecee, Christophe, Rosco, Karamilo and Salmatt for their feedback on the preliminary versions of this article, and Bill Adams for his “polish” on the English translation.