As you all should know by now, I have been on a mission to exploit the performance potential of Gigabyte’s P55 motherboards. I started by using the P55-UD3R to break four world records using Intel’s i5 750 quad core CPU. Please feel free to visit my full overclocking review of the P55-UD3R here. I found it to be very impressive for a sub-$150 mainstream motherboard. As you can imagine, when Gigabyte told me they were already working on an update to that formula with the new P55A-UD3R, I was very interested to see the result!
Two weeks ago, Gigabyte Japan delivered on that promise and sent me the new P55A-UD3R and, as an added bonus, the P55A-UD6. At first glance I could tell the changes were more than updated marketing. The boards have a few new features that are very interesting indeed. The biggest would have to be the inclusion of a SATA 6 Gb/s controller and a USB 3.0 or “SuperSpeed USB” (also known as “SSUSB”) controller. Also readily apparent on the UD3R was an upgrade in the power delivery, from 8 phases on the original to 12 phases on the new version! I was very eager to see if the added phases of power circuitry would boost the overclocking potential, and I wasted no time finding out! Right off the bat, and with only 4 liters of liquid nitrogen to play with, I was quickly able to push my i5 750 CPU to new heights and break the elusive 100% OC barrier. That means I was able to double the default speed of 2660MHz…check it out!
After that quick test, though, I had to put a damper on my excitement and focus on the question all of you are reading this to answer: first and foremost, how is SSUSB? So, for the next week, I spent countless hours going back and forth with Gigabyte, working my way through testing this new technology. Part 1 of my SSUSB testing can be read here. Today, I want to follow up on my previous testing by taking SSUSB to its limit!
Part 2 – SuperSpeed USB at the Limit
So, how do I plan to take this technology to the limit? Here’s a hint:
That’s right, I have two of Intel’s latest 160GB solid state drives. Hopefully these will allow me to show you just how much potential this technology has. I’ve been debating how best to present this data. I thought about running the exact same tests as before with these Intel drives, correlating the data, and drawing my conclusions from the results. But I felt that wasn’t really enough. So, I’ve decided to do some real-world testing, and more.
Gigabyte’s Controversial “USB3.0 Turbo” Mode
As I mentioned in my initial testing, there seems to be a lot of debate concerning Gigabyte’s design decisions. Let me refresh your memory briefly. Both the SSUSB controller and the SATA3 controller can communicate with the CPU in two different ways, and the user can choose which one to use. If maximum performance of these devices is your priority, you can turn on “Turbo” mode in the BIOS. This will interface the controllers with the CPU via one of the 16 PCIe 2.0 lanes on the CPU die itself. The downside of this approach is that your PCIe graphics card bandwidth will be cut in half (although that’s usually still plenty). If you need fast, but not the absolute fastest, SSUSB/SATA3 performance, and you prefer your graphics card to have maximum bandwidth, you can turn off Turbo mode in the BIOS, and the SSUSB/SATA3 controllers will interface with the CPU via a PCIe 1.1 lane in the P55 chipset, which in turn reaches the CPU over the Direct Media Interface (DMI). As we saw in my initial testing, Turbo mode did have some good performance advantages. The one situation in which the user will not have an option is when running dual graphics cards. In this configuration, the system will run 8 PCIe lanes to each graphics card and force the SSUSB and SATA3 controllers to use the non-Turbo interface.
The reason this approach has been so controversial is that some of Gigabyte’s competitors have announced a different way of implementing both the SSUSB and SATA3 controllers on their motherboards, one which they claim does not compromise on anything. This is done with the use of a “bridge chip”. A bridge chip basically converts the bandwidth of the four PCIe 1.1 lanes on the P55 chipset into two PCIe 2.0 lanes, one for each of the controller chips. Despite the claim of a no-compromise solution, there are a couple of compromises to consider in this situation. All devices connected to the P55 chipset (NICs, SATA devices, audio chipsets, USB controllers, IEEE 1394 controllers, etc.) share the DMI to communicate with the CPU. This is a potential bottleneck. The other downside is that these bridge chip solutions add cost to the product.
My goal with this follow-up testing is to answer the following questions:
- What will it take to find the DMI’s limit?
- Will turning on Turbo mode actually help in a situation where the DMI is saturated?
My system hardware will be virtually identical to my previous testing. However, I’m doing this for the enthusiast, so I felt it appropriate to simulate a more realistic configuration. Last time, the OS was stripped down, I had all unnecessary devices disabled in the BIOS, and I didn’t load any nonessential software. This time I will configure the system as if it were a daily rig. I’ll be using Windows 7 x64 Ultimate in its default configuration. I’ll also leave all devices enabled in the BIOS. Additionally, I’ll have the system connected to the internet, monitoring for updates, with anti-virus keeping a watchful eye over things. Here is a re-cap of the system specs, with changes highlighted:
- Gigabyte GA-P55A-UD6 motherboard (F6 BIOS)
- Intel Core i7 870 quad core CPU at 4.4GHz, water cooled*
- 2x 1GB Corsair Dominator GT at 1000MHz (2000MHz DDR) with 7-8-7-20 1T timings
- 1x 160GB Western Digital 7200RPM HDD running off the Intel ICH10R
- 2x 16GB MTRON Pro SSD in RAID0 running off the Intel ICH10R
- 2x 160GB Intel X25-M SSD, one running off the Intel ICH10R and the other running off the SSUSB controller
- 1x Gigabyte GTX 260 with nVidia Forceware 191.07 drivers
- PC Power and Cooling 750W power supply
- Windows 7 Ultimate 64bit
So, as you can see, I’ve updated the motherboard BIOS to the latest release. I also updated the SSUSB drivers to version 22.214.171.124 released December 3rd. Because I wanted to expose any bottleneck in the system, I reduced the memory from 4GB to 2GB. This is also more likely to reflect a wider range of users’ systems right now, rather than only the most cutting-edge ones. The rest of the changes are fairly self-explanatory.
* It’s getting cold outside, and I have my radiator positioned in my window which explains my very low ambient temperatures.
With access to these amazing new SSDs from Intel, I owed it to myself to spend a few minutes testing PCMark 2005. So I swapped out my CPU for the stronger i5 750 (PCMark 2005 does not show much improvement with Hyper Threading technology), and popped in a second graphics card. Then I configured the two Intel drives in a RAID0 for the PCMark 2005 HDD tests. For those of you who do not know, PCMark has been found to be extremely HDD performance bound. With conventional HDDs it is quite difficult to score past about 15K points. Most users scoring above 20K points are using some form of RAM drive, like the Gigabyte i-RAM, or the new ACARD ANS-9010 drive. As you can see here, I’ve been able to score a very respectable 23K points with my humble configuration, using only water cooling.
So, to answer my first question, I needed to figure out a way to attempt to saturate the DMI bus. Nearly every peripheral in the system runs through it, and that makes things difficult for me as a bencher, as I usually disable all unnecessary devices. In this case, I would normally disable the USB3.0 controller, the IDE controller, the Gigabyte SATA controller, both NICs, and a lone serial port. Usually, I would use a custom OS image that is stripped of all unnecessary drivers, applications, services, and security settings. Normally, I would further refine my settings after OS installation by disabling “Windows Search”, “Windows Firewall”, and many other services that provide no benefit while attempting to get the highest scores in any given benchmark. But this time, I did things differently. I loaded a fresh retail copy of Windows 7 Ultimate 64 bit. I didn’t streamline or modify anything. I installed McAfee Anti-virus 8.7. I configured Automatic updates, just like I do on my gaming/office PC. I installed all the Gigabyte software included with the motherboard. My goal with this was to create a system configuration very similar to many of your systems out there.
So, how does this saturate the DMI bus? Well, I’ve connected five storage devices to the system: two MTRON SSDs in a RAID0 array for the OS, a single Western Digital 160GB drive for storage, and the two Intel drives. One Intel drive is connected to the Intel ICH10R via SATA2, and the other is the tested device, either in the SSUSB Buffalo device or plugged into the ICH10R for the SATA tests.
- During each one of the test runs, I’ll have a file transfer running in the background. The transfer will be from a GbE network location to the WD storage drive running at 20-22MB/s sustained rates.
- I’ll also be running an instance of PCMark05, which will be running the “HDD General Usage” benchmark on the lone test subject Intel SSD.
Both of these things will be happening while running through the testing suite. The only other major change from my previous testing is that I’ll be running each test three times to ensure consistent results. In addition to ATTO and HDTune, I also ran CrystalDiskMark by popular request. Here is a screenshot of a test in action, showing what my screen actually looked like while testing.
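If you want to repeat the three-run approach on your own system, a minimal Python sketch like the one below can boil each set of runs down to an average and a run-to-run spread, then compare a clean set against a dirty one. The function names are my own, and the sample numbers are made up purely for illustration; they are not results from my testing.

```python
from statistics import mean

def summarize(runs_mb_per_s):
    """Return (average, spread in percent of the average) for repeated runs."""
    avg = mean(runs_mb_per_s)
    spread_pct = (max(runs_mb_per_s) - min(runs_mb_per_s)) / avg * 100
    return avg, spread_pct

def percent_change(clean_avg, dirty_avg):
    """Change of the dirty result relative to the clean result, in percent."""
    return (dirty_avg - clean_avg) / clean_avg * 100

# Hypothetical throughput values in MB/s, for illustration only.
clean_runs = [100.0, 102.0, 98.0]
dirty_runs = [90.0, 92.0, 88.0]

if __name__ == "__main__":
    clean_avg, clean_spread = summarize(clean_runs)
    dirty_avg, dirty_spread = summarize(dirty_runs)
    print(f"clean: {clean_avg:.1f} MB/s (spread {clean_spread:.1f}%)")
    print(f"dirty: {dirty_avg:.1f} MB/s (spread {dirty_spread:.1f}%)")
    print(f"change: {percent_change(clean_avg, dirty_avg):+.1f}%")
```

A useful rule of thumb is that a clean-versus-dirty difference smaller than the run-to-run spread probably should not be read as a real effect.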
So, with all of this data flying around, I’m eager to see if the standard SSUSB (the non-Turbo version using the P55 interface) will show any signs of slowdown. Since I’m not loading the CPU, I think we should be able to safely assume that any possible slowdown would indicate saturation on the DMI bus. And if that happens, we should be able to conclude that any other manufacturer using a “bridge chip” would run into a similar bottleneck. Let’s see what happened…
We’ll start off with the baseline. This testing was done with all of the extra stuff in the BIOS turned off, but with no other changes to the configuration as described above. I tested the SSUSB with and without “Turbo mode”, and also with the Intel drive connected via SATA2 for comparison. Here are the results.
Starting with ATTO read speeds: it would appear to me that all tests were affected by the additional data transfers (I’ll refer to those tests as “dirty”), with the influence most obvious at larger file sizes on the non-Turbo SSUSB interface. Notice also that the small to mid-size files were affected quite a bit in the dirty Turbo tests.
Next up, we have the ATTO write speeds. What’s this? I actually got better results during the dirty testing…although nothing major. There is a definite drop in performance in the 16K and 32K write speeds with the non-Turbo clean testing.
Next test: the HDTune read speed testing. The SATA2 tests were a little inconsistent here, but this really showed what was happening with the non-Turbo SSUSB testing. As you can see in my chart below, minimum read speeds dropped tremendously, and average speeds showed significant effects. But the most interesting data is in the actual test screenshots…
Here is a screenshot of a clean HDTune run with the non-Turbo SSUSB interface:
Here is the screenshot of the same test, with the same interface, but with the other transfers occurring simultaneously.
Lastly, the new test: CrystalDiskMark. In this testing, all results were virtually identical in all but the SATA2 4K write testing where the clean test was more than 2MB/s (or about 3%) faster than the dirty test. Far from conclusive.
I think my last comment sums up my feelings about all this testing: “Far from conclusive”, which is really disappointing after all the time I put into this. However, was I able to answer the two questions I presented at the outset of this testing?
- What will it take to find the DMI’s limit?
I thought when I began that if I simply saw an indication of performance deterioration while performing the other transfers, that would indicate the DMI bus becoming saturated. Even during the testing, when I saw the results of the HDTune dirty tests, I thought I was really on to something. However, if the DMI bus was being saturated, wouldn’t the SATA2 dirty testing also show degraded performance? After all, it also uses the DMI to communicate with the CPU. I put the question to my source at Gigabyte, and I’m still waiting for a response.
Despite this question, there definitely was a change in performance during my dirty testing. The results speak for themselves, and there did seem to be some evidence for decreased non-Turbo SSUSB performance.
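One way to sanity-check the saturation question is a rough bandwidth budget. The Python sketch below is only a back-of-the-envelope estimate under stated assumptions: it takes the DMI ceiling to be about 1000 MB/s per direction (four 2.5 GT/s lanes with 8b/10b encoding), and it lumps every stream into one direction as a worst case. Only the 22 MB/s network figure comes from my test setup; the two SSD figures are placeholders that you should replace with your own measured rates.

```python
# Rough DMI budget: sum the concurrent streams and compare to the link ceiling.
# Assumption: DMI on P55 offers roughly 1000 MB/s per direction in theory.
DMI_MB_PER_S = 1000.0

def dmi_utilization(stream_rates_mb_per_s, capacity_mb_per_s=DMI_MB_PER_S):
    """Fraction of the assumed DMI ceiling used by the given streams."""
    return sum(stream_rates_mb_per_s) / capacity_mb_per_s

streams = {
    # Upper end of the 20-22MB/s background copy described in the test setup.
    "GbE copy to the WD drive": 22.0,
    # Placeholder values, NOT measurements: substitute your own numbers.
    "PCMark05 HDD General Usage on an SSD": 250.0,
    "Benchmark on the drive under test": 250.0,
}

if __name__ == "__main__":
    for name, rate in streams.items():
        print(f"{name:40s} {rate:6.1f} MB/s")
    used = dmi_utilization(streams.values())
    print(f"Estimated DMI utilization: {used:.0%} of the assumed ceiling")
```

This ignores protocol overhead, read-versus-write direction, and every other device sharing the link, so treat the result as a ballpark only. It is a way to frame the question, not an answer to it.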
- Will turning on Turbo mode actually help in a situation where the DMI is saturated?
If the performance deficiency we saw was caused by saturation of the DMI bus, then I think we can answer this question with “sometimes”. Looking back over the testing, I would say that Turbo mode usually maintained its performance during the dirty testing, but I cannot say with conviction that it was unaffected. Specifically, the ATTO read results show some obvious deterioration in read speed.
To truly answer my questions, I really need a competitor’s motherboard that uses a bridge chip, so that I can compare these technologies side by side. Until then, the best advice I can give is a repeat of what I wrote in part 1:
For those users who do not want to compromise performance for any device, they should be looking at a system based on the Nehalem (X58 & LGA1366) architecture anyway. For a budget friendly system based around P55, this [Gigabyte P55A series] should make everyone happy.
“SSUSB bug” Update
I owe one more bit of information to all of you who read my first article and are concerned about the “SSUSB bug” I discovered in my testing. I found part of the problem. The NEC USB3.0 controller is very sensitive to PCIe speeds. The connectivity issues I had during my first round of testing were caused by a non-standard 103MHz PCIe speed. Setting my PCIe speed to the standard 100MHz resolved all my SSUSB connectivity issues. I was not, however, able to confirm whether or not the odd “power cycle” issue was also resolved. While I did see some abnormal fluctuations in the transfer speeds of both ATTO read and write tests, I did not have any more lockups like I did when testing the Vertex drive in the first tests.