Great read on what makes memory fly – Ralph Nelson, aka Deep Powder
Problem Definition (Test What?)
Having done significant DDR memory and AMD A64 CPU testing on AMD setups, I was wondering how that experience would relate to my new Core 2 Duo E6600 setup. What would be the dominant overclocking factors? To find out, I ran a series of tests using SuperPi (1M digits) as the performance metric.
Results were obtained using the following rig:
- Intel Core 2 Duo E6600
- eVGA 680i motherboard
- Mushkin XP2-8000 (4 x 1 GB) 4-5-4-12 @ 2.3V
- Iwaki MD-20RLZT, D-Tek FuZion, Chiller, Reservoir
- PC Power & Cooling Silencer 750 Quad
Previous Experience with DDR
In the past, memory speed and timings have played a reasonable part in performance. To reach maximum performance, you would overclock both the memory and the CPU. With 32-bit processors such as the AMD XP, DDR memory performance and settings had a more significant effect (5-6% max) than with the 64-bit CPUs that followed (2-3% max). How would the C2D with DDR2 perform, and if it behaved differently, would the approach to overclocking change?
Several tests were run. In general, the results reported are the average of 5 tests. Exceptions to that are noted as appropriate. Tests included the following:
- Faster Northbridge speeds (FSB) for fixed CPU and memory speed and timings
- Faster memory speeds for fixed CPU speeds and memory settings
- Memory timing variations at fixed CPU and memory speeds
- How much CPU overclock is needed to offset memory speed?
- Biggest gain associated with moving from 600 to 800 to 1000 memory speeds
Results (Faster Northbridge speed, FSB)
Higher CPU speed gives better performance, as Figure 1 shows, which is no surprise. Memory speed and timings are held constant in Figure 1. You can also see that FSB speed has minimal effect when the CPU and memory speeds are fixed; i.e., running the Northbridge faster by itself provides no advantage except at lower FSB speeds.
Figure 1: Pi run time versus FSB for various CPU speeds
Results (Faster Memory Speed)
Figure 2 shows the gains from higher memory speed at three different CPU speeds: 3.5 GHz, 3.0 GHz and 2.4 GHz. Memory timings are held constant in these studies. As you can see, higher memory speed does improve performance, but the difference between memory at 800 and 1000 is minimal, on the order of 1.5% or less (see details below).
Figure 2: Pi run time versus memory speed for various CPU speeds
Results (Memory Timings)
Next, data on the effects of memory timings are given in the table HERE. An interesting detective story is associated with the initial part of the investigation. The table shows results using my Mushkin memory, which is rated at 4-5-4-11. Two sets of memory timing tests are shown: one ran Pi using 1M digits with 5 samples, and the other 2M digits with 10 samples. The 1M case is often used as a reference test by a number of folks, including myself.
To make a long story short, the 1M-digit, 5-sample tests do not produce the results we would expect. For example, 5-5-5-12 comes out faster than 4-4-4-11, a very questionable result. However, the standard deviations of the tests (called sigma in the table) show that the run-to-run variability is large enough that this unexpected result is still plausible. (For data drawn from a normal distribution, approximately 68% of the values fall within one standard deviation of the mean.)
Thus about 68% of the average values would lie between 14.7689 (average - sigma) and 14.8835 (average + sigma) if I had an infinite number of tests. My result for 5-5-5-12 lies within that spread, so I can’t say it is wrong. Thus my metric, as applied to this question, doesn’t appear to be good enough! You must always make sure your measurement reflects what you think it does.
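The one-sigma check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the table: the function names are made up, the use of the sample standard deviation (n - 1) is an assumption, and the two candidate averages at the bottom are hypothetical values chosen to show both outcomes. Only the band limits 14.7689 and 14.8835 come from the text.

```python
from statistics import mean, stdev


def one_sigma_band(samples):
    """Return (average - sigma, average + sigma) for a list of run times.

    Assumption: sigma is the sample standard deviation (n - 1 divisor).
    """
    avg = mean(samples)
    sigma = stdev(samples)
    return avg - sigma, avg + sigma


def within_band(value, band):
    """True if value lies inside the (low, high) band, limits included."""
    low, high = band
    return low <= value <= high


# Band quoted in the text for the 1M-digit, 5-sample test.
band = (14.7689, 14.8835)

# Hypothetical averages from a second timing setting (not measured data).
for candidate in (14.80, 14.95):
    verdict = "cannot be distinguished" if within_band(candidate, band) else "differs"
    print(f"{candidate:.2f} s: {verdict} at the one-sigma level")
```

If the average for a second setting falls inside the band, the test cannot separate the two settings, which is the situation described for 5-5-5-12 versus 4-4-4-11.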
From this, I concluded that I needed a better metric to measure the effect of memory timings. For this reason, I ran the Pi test set with 2M digits and 10 samples. It took half a day of testing to run those cases, in combination with a few “honey-do”s. The results in the table are more typical of what we expect; i.e., the various settings rank in the usual order of importance.
So what is the maximum improvement that I might reasonably expect from better memory timings? Taking the one-sigma extremes of both results, the improvement will be less than 1% for a single-step change to all timings, i.e., going from 5-5-5-12 to 4-4-4-11. I get this number from (37.4248 + 0.1108)/(37.2565 - 0.08831) = 1.0099, or 1%.
How does this compare with buying faster memory? If I had DDR2 rated at 800 and moved to memory rated at 1000, then using data I’ve not tabulated but did plot in the figure, (15.019 + 0.017)/(14.854 - 0.034) = 1.0146, or 1.5%. Thus the speed step is worth a little more than the timing step; by these estimates, 1000 at 5-5-5-12 should edge out 800 at 4-4-4-11. Of course, bigger steps in memory speed or wider spreads in timings would have to be tested, so be careful with extrapolation.
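Both upper-bound ratios above follow the same pattern: the slower average plus its sigma, divided by the faster average minus its sigma. A minimal Python sketch that reproduces the arithmetic is given here. The function name is made up for illustration, and the pairing of each number with a particular setting is my reading of the text.

```python
def worst_case_ratio(slow_avg, slow_sigma, fast_avg, fast_sigma):
    """Largest plausible slow/fast run-time ratio at the one-sigma level."""
    return (slow_avg + slow_sigma) / (fast_avg - fast_sigma)


# Timings: 5-5-5-12 versus 4-4-4-11 (2M digits, 10 samples).
timing_ratio = worst_case_ratio(37.4248, 0.1108, 37.2565, 0.08831)

# Speed: memory at 800 versus 1000 (1M digits).
speed_ratio = worst_case_ratio(15.019, 0.017, 14.854, 0.034)

print(f"timings: {timing_ratio:.4f}")  # 1.0099
print(f"speed:   {speed_ratio:.4f}")   # 1.0146
```

Because each ratio stacks the sigmas in the least favorable direction, it is an upper bound on the gain rather than the typical gain.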
Results (Bang for the Buck Results)
A little more comparative info is shown in Figures 3 and 4.
Figure 3 shows performance as a function of CPU speed for two cases, memory run at ~1066 and ~600, which are the extreme memory speeds I’ve tested. As you can see, at 3 GHz the performance difference due to memory speed is ~3.7%; the 600 memory would need an additional CPU overclock of about 100 MHz to match. Depending on the cost difference between 600 and 1066, you would have to decide whether the performance payback is worth the extra cost to you.
Figure 3: Pi run time versus CPU speed for 2 memory speeds
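As a rough cross-check on the ~100 MHz figure, here is a first-order Python sketch. It assumes that Pi run time scales inversely with CPU clock, which is a simplification of my own and not something the tests establish; the figure in the text is read from the plot. The function name is made up for illustration.

```python
def offset_overclock_mhz(cpu_mhz, gap_percent):
    """Extra CPU clock (MHz) needed to close a run-time gap of gap_percent.

    Assumption: run time is inversely proportional to CPU clock, so a
    3.7% longer run time is offset by a 3.7% higher clock.
    """
    return cpu_mhz * gap_percent / 100.0


extra = offset_overclock_mhz(3000, 3.7)
print(f"about {extra:.0f} MHz")  # about 111 MHz
```

The estimate of about 111 MHz is in line with the roughly 100 MHz read from Figure 3.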
Figure 4 shows part of an earlier plot in greater detail. From this plot you can see that if you bought 800 memory, you would get 2/3 of the potential gain between 600 and 1066. Thus 800 might be a good option for you if the price premium for 1000-speed memory is too high.
Figure 4: Pi run time versus memory speed emphasizing gains by 600, 800 and 1000 memory speeds
Looking at the table noted above, you can see that buying memory with 4-4-4-N timings would give you a gain of ~0.5% in performance over memory with 5-5-5-N timings. Again, you need to decide whether that is worth the cost.
From these studies, I conclude:
- CPU speed is most important and gives the greatest payback
- Higher memory speed is next
- Tighter memory timings come third (speed and timings usually compete with one another; as speed goes up, the timings usually have to be loosened)
- FSB (faster Northbridge) alone has minimal value
Conclusion & Comparing DDR2 to DDR
Comparing my previous experience with DDR to my current experience with DDR2, I find the newer memory, combined with the newer CPUs and motherboards, to be much easier to overclock. Basically, with my current setup using DDR2, I can plug the memory into the motherboard, then go ahead and overclock the CPU while keeping the memory near its rated speed and timings.
The memory speed and timings can be further refined, but the payback for your efforts is limited to a percent or two in performance, something you’ll never see in day-to-day work. You can also spend more money on faster, tighter memory, but above memory speeds of ~800 MHz the payback appears to be debatable (realize that I’ve not tested the super-fast memory available today, but I see no reason to expect significant benefit with my current CPU and cooling).
Check out Ralph’s website HERE.