Intel i7-6950X Broadwell-E CPU Review

Today we get a chance to review Intel’s next processor for their High-End Desktop (HEDT) platform. The king of the consumer hill used to be the mighty i7-5960X our friend Dino reviewed back in August of 2014. That “extreme” CPU shook up the landscape by providing the consumer with a high-clocked, octo-core CPU with a total of 16 threads. Since then, Intel brought their 14 nm Broadwell CPU architecture to market on their mainstream platform through their “tick-tock” cycle. Naturally, the next HEDT processor, Broadwell-E, is based on the same 14 nm lithography and Tri-Gate (FinFET) technology as Broadwell.

Broadwell-E, specifically the i7-6950X, brings us our first 10 core CPU which isn’t based on the Xeon or Opteron platforms. Typically this brings with it lower clockspeeds or high pricing. Broadwell also arrived on the scene with some slight Instructions Per Clock (IPC) improvements over Haswell/Haswell-E based CPUs. Those increases likely will not go away with Broadwell-E. Take those improvements, plus the additional cores, and you have a CPU ready to do a whole lot of things at once. Be it gaming, content creation, VR, or just about anything else you can think of, you can do multiple intensive tasks on the i7-6950X without breaking a sweat. Mega-Tasking is the term Intel is using.

This type of performance does come with a price as it has with all of the HEDT processors, particularly the Extreme SKUs, and this one will not be any different. It will come in well over the $999 mark the i7-5960X held. We’ll talk more on pricing later. For now it is time to dig in a bit to check out specifications and features as well as get into the performance of the CPU. Grab your favorite beverage, sit back, relax, and enjoy!

Specifications and Features

Digging into the specifications, the extreme CPU (i7-6950X) comes with a total of ten physical cores which, with HyperThreading, total 20 threads. That is a 25% increase in cores and threads over the i7-5960X it is replacing. The clockspeeds come in at 3.0 GHz base with a maximum turbo (Turbo Boost 2.0, note) frequency of 3.5 GHz. Something new to the Broadwell-E family of CPUs is the addition of Turbo Boost Max Technology 3.0. I will get into details a bit later, but in a nutshell this takes the best core and boosts it up, even past the Turbo Boost 2.0 specification.

We mentioned earlier Broadwell-E also uses the 14 nm lithography and its Tri-Gate (FinFET) 3D Transistors. Across all SKUs, the TDP is unchanged from Haswell-E at 140 W. This means at the top end you are getting two more cores, for a total of four more threads, in the same power envelope. In order to dissipate the heat, Broadwell-E uses the same TIM found on Haswell-E (Solder Thermal Interface Material, STIM)… which is a good thing.

The maximum amount of memory to be allocated with this CPU/platform remains at 128 GB. As most will likely recall this is in a quad-channel configuration. One big difference here, compared with Haswell-E, is the memory standard has bumped up a bit from 2133 MHz (also the standard in Skylake currently) to 2400 MHz in stock form.
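To put the memory bump in perspective, here is a back-of-the-envelope sketch of theoretical peak bandwidth in quad-channel mode. It assumes the standard 64-bit (8-byte) DDR4 channel width and treats the 2133 and 2400 figures as transfer rates in MT/s; the function name is made up for illustration, and real-world throughput will land below these ceilings.

```python
# Theoretical peak DRAM bandwidth = transfer rate x bus width x channels.
# Assumption: each DDR4 channel has a 64-bit (8-byte) data bus.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(mega_transfers_per_s, channels):
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


haswell_e = peak_bandwidth_gbs(2133, channels=4)
broadwell_e = peak_bandwidth_gbs(2400, channels=4)

print(f"Quad-channel DDR4-2133: {haswell_e:.1f} GB/s")
print(f"Quad-channel DDR4-2400: {broadwell_e:.1f} GB/s")
print(f"Uplift: {(broadwell_e / haswell_e - 1) * 100:.1f}%")
```

The same function works for overclocked memory, for example `peak_bandwidth_gbs(3000, channels=4)` for a DDR4-3000 setup.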

Moving along to PCIe lanes, the top three SKUs (6950X, 6900K, and 6850K) all have 40 PCIe lanes from the CPU. The i7-6800K has 28 lanes, the same as the 5820K it replaces. If you are looking for some x16/x16 action, it’s going to have to come from the 6850K and higher.
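The lane budget is simple arithmetic, sketched below. This only checks whether a slot layout fits within the CPU's lane count; the layout names and the `fits` helper are illustrative, and how a given motherboard actually wires its slots is up to the board vendor.

```python
# PCIe 3.0 lanes provided by each Broadwell-E CPU, per the paragraph above.
CPU_LANES = {"i7-6950X": 40, "i7-6900K": 40, "i7-6850K": 40, "i7-6800K": 28}

# Slot layouts to check, written as lanes per slot.
LAYOUTS = {
    "x16/x16": (16, 16),
    "x16/x8": (16, 8),
    "2x16 + 1x8": (16, 16, 8),
    "5x8": (8, 8, 8, 8, 8),
}


def fits(layout, lanes):
    """True if the layout's total lane count is within the CPU's budget."""
    return sum(layout) <= lanes


for cpu, lanes in CPU_LANES.items():
    supported = [name for name, layout in LAYOUTS.items() if fits(layout, lanes)]
    print(f"{cpu} ({lanes} lanes): {', '.join(supported)}")
```

A dual x16 setup needs 32 lanes, which is why the 28-lane 6800K falls short.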

As a side note, you will be able to use X99 socket 2011-v3 boards with a BIOS update for these CPUs. You will need both the BIOS update and the software/driver to use the new Turbo Boost Max Technology 3.0.

Intel i7 6950X Specifications
# of Cores: 10
# of Threads: 20
Clock Speed: 3.0 GHz
Max Turbo Frequency (2.0): 3.5 GHz
Max Turbo Frequency (3.0): Depends on Sample (15% gains)
Instruction Set: 64-bit
Instruction Set Extensions: SSE 4.1/4.2, AVX 2.0
Lithography: 14 nm Tri-Gate 3D Transistors
TDP: 140 W
Thermal Solution Spec: PCA 2013D

Memory Specifications
Max Memory Size: 128 GB
Memory Types: DDR4 2400
# of Memory Channels: 4
ECC Memory Support: Yes

Expansion Options
PCI Express Revision: 3.0
PCI Express Configurations: Up to 2×16 + 1×8 or 5×8
Max # of PCI Express Lanes: 40 (6800K = 28 lanes)

Intel Data/Platform Protection Technology
AES New Instructions: Yes
Secure Key: Yes
OS Guard: Yes
Trusted Execution Technology: No
Execute Disable Bit: Yes
Anti-Theft Technology: Yes

Below is a list of all the SKUs Broadwell-E will offer us at the consumer level. Compared to Haswell-E there is an additional SKU, in this case, the 6950X. The rest match up quite nicely with their predecessors: the 3.2 GHz 6900K and 3.0 GHz 5960X (both octo-cores), the 3.6 GHz 6850K and 3.5 GHz 5930K (both high-end hex-cores with 40 PCIe lanes), and finally the 3.4 GHz 6800K, which replaces the 3.3 GHz 5820K (“low end” hex-cores with 28 PCIe lanes).

Cache is up to 25 MB of L3 on the 6950X (2.5 MB per core, all of which can be shared), while the rest of the product line matches its predecessors: 20 MB for the 6900K/5960X and 15 MB for the 6850K/6800K/5930K/5820K.

Pricing on the new deca-core 6950X is up substantially from its octo-core predecessor, coming in at a wallet-emptying $1723. The MSRP on the 5960X was $999, which makes this an increase of over 70%. Seems like a pretty big jump, no? It does to me as well. The key to remember here is the CPU is made for those who can use all the cores, yielding increased productivity. Mega-Tasking. When time is money, this amount can be made up pretty fast. The octo-core i7-6900K now holds a $1089 price tag while the hex-core i7-6850K is $614 (a $31 increase over the 5930K), and the little hex, 6800K, is $434 (a difference of $45 over the 5820K).
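For those who want to check the pricing math, here is a quick sketch. The 5930K and 5820K prices are not quoted directly in this review; they are derived from the stated $31 and $45 differences, so treat them as inferred values.

```python
# Launch prices (USD) as (new, old). The 5930K and 5820K prices are
# inferred from the $31 and $45 differences quoted in the review.
PRICES = {
    "i7-6950X vs i7-5960X": (1723, 999),
    "i7-6850K vs i7-5930K": (614, 614 - 31),
    "i7-6800K vs i7-5820K": (434, 434 - 45),
}


def percent_increase(new, old):
    """Percentage increase of the new price relative to the old one."""
    return (new - old) / old * 100


for pair, (new, old) in PRICES.items():
    print(f"{pair}: ${old} -> ${new} (+{percent_increase(new, old):.1f}%)")
```

The increase is measured against the older chip's price, which is how the "over 70%" figure for the 6950X comes about.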

pricing

(Source: Intel)

Key Features (from Intel)

Intel® Core™ i7-6950X Processor Extreme Edition Key Features:

  • 20-Way Multi-Task Processing: Runs 20 independent processor threads in one physical package
  • Massive PCI Express Bandwidth: 40 lanes of PCIe supported through the processor.
  • Intel® Turbo Boost Max Technology 3.0*: Dynamically increases single-core turbo frequency when applications demand more performance by moving the workload to the fastest core on the processor. Speed when you need it, energy efficiency when you don’t.
  • Intel® Turbo Boost Technology 2.0: Dynamically increases the processor’s frequency, as needed, by taking advantage of thermal and power headroom when operating below specified limits.
  • Intel® Hyper-Threading Technology: Allows each core of the processor to work on two tasks at the same time providing unprecedented processing capability for better multi-tasking and for threaded applications. Do more with less wait time.
  • Intel® Smart Cache: Up to 25MB of shared cache allows faster access to your data by enabling dynamic and efficient allocation of the cache to match the needs of each core, significantly reducing latency to frequently used data and improving performance.
  • Overclocking Enabled: Fully unlocked core multipliers, power, base clock and DDR4 memory ratios for ultimate flexibility with overclocking.
  • Integrated Memory Controller: Supports 4 channels of DDR4-2400 memory with 1 DIMM per channel. Support for the Intel® eXtreme Memory Profile (Intel® XMP) specification, revision 2.0 for DDR4.

One of the more interesting new technologies on Broadwell-E is Turbo Boost Max Technology 3.0. As Intel describes above, it dynamically increases one core’s turbo frequency when applications need more performance and moves the workload to the fastest core on the CPU. In other words, this latest version of turbo will identify the fastest core on the CPU and then pin the application to that core (performed via driver/application). The frequency you achieve is above and beyond Turbo Boost 2.0. Intel would not tell us how fast it would go; however, they did mention it could be up to 15% faster and to “do the math.” If my math is correct (and assuming 1:1 scaling clockspeed:performance), some processors could see speeds of 4 GHz on a single core.
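Here is that math spelled out as a small sketch. It assumes the 15% applies on top of the 3.5 GHz Turbo Boost 2.0 ceiling and that the core clock moves in whole multiplier steps of a 100 MHz base clock; both are my assumptions, not something Intel confirmed.

```python
BCLK_MHZ = 100        # assumed default base clock
TURBO_2_MHZ = 3500    # Turbo Boost 2.0 ceiling of the i7-6950X
MAX_UPLIFT = 0.15     # Intel's "up to 15%" figure

# Upper bound if the full 15% lands on top of Turbo Boost 2.0.
ceiling_mhz = TURBO_2_MHZ * (1 + MAX_UPLIFT)

# Assumption: clocks change in whole multiplier steps, so round down.
ratio = int(ceiling_mhz // BCLK_MHZ)

print(f"Upper bound: {ceiling_mhz / 1000:.3f} GHz")
print(f"Nearest whole ratio: {ratio}x = {ratio * BCLK_MHZ / 1000:.1f} GHz")
```

Since each sample is different, this is a ceiling rather than a promise.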

This is accomplished by a driver/software layer you need to install on your system. It will not work unless your BIOS is updated and the software is installed as the OS cannot “effectively route workloads to ordered cores.” The software resides minimized in your system tray. When you want to access its configurable features, simply right click and select “Show UI” control.

Below are a few slides from Intel covering the high-level features, three new overclocking features, and Thunderbolt 3.0 technology. The first shows us just a couple of ways to utilize such a monster CPU…there is the “Mega-Tasking” word again!

The next slide is a good one for our crowd of overclockers. It lists three new overclocking features: Per Core Overclocking (new?), AVX Ratio Offset, and VccU Voltage Control. Per Core Overclocking, as the name describes, gives you the ability to set each core to a specific clockspeed. Nothing new there really. The AVX ratio offset is, in fact, something new. This feature distinguishes AVX instructions (which we know use more power and can raise voltage in some stress tests and real-world applications) from the rest, and has the CPU keep AVX workloads at its nominal operating frequency, which preserves TDP headroom for other instructions (i.e. SSE). VccU Voltage Control gives you control over the voltage of the ring architecture. Other mainstream lines have this too.

Last up, the Thunderbolt 3.0 slide. Intel’s Thunderbolt 3.0 brings you Thunderbolt over USB-C plugs, allowing connectivity to displays, docks, or any data device. You will be able to pump 4K video, 10 GbE networking, and external graphics solutions through it. A pretty flexible protocol indeed. Perhaps by running through the USB-C connector, which is becoming more common every day, Intel can increase its market saturation. Read more about Thunderbolt 3.0 technology at the Intel Website.

(Image Source: Intel)

Below I have attached some more slides from the Intel press deck showing their gains in real-world applications such as Handbrake, Adobe Premiere Pro CC, Blender, and Kolor Autopano Video Pro. The gains vary widely between applications, but they are tremendous across these productivity/content creation workloads. See the Intel slides below for their internal testing and results:

Below we are looking at a picture of the die map. Since we are on the HEDT platform, you may notice we are missing the integrated GPU from Broadwell (Iris Pro 6200) as well as the eDRAM supporting it (though in Skylake it did act as a DRAM buffer). If you want into the HEDT platform, you will still need a discrete GPU.

Die Map

Product Tour

Next, we get to see our first actual pictures of the 6950X. Since this is an engineering sample, it did not arrive in the final retail packaging (pictured in thumbnail form below) but in a simple little cardboard box. When you open up the box, the CPU is resting on some soft foam padding.

One thing you will notice in these pictures is the IHS being a different shape than the outgoing Haswell-E flagship, the 5960X.

Super Secret Intel Box!

Inside the Super Secret Intel Box!

6950X

Back

Retail box (front and back)

In the pictures below, I compared the i7-6950X with an i7-5820K. I do not own an i7-5960X, but there are no notable differences between the i7-5820K and the i7-5960X here. You can now clearly see the difference in the IHS design between the two. The 5820K on the left has two larger contact points at the top and bottom, while the 6950X has four contact points to spread the load across the PCB.

Flipping it over doesn’t show too much really. About 2011 pads there to match up with the 2011 pins in the socket 2011-v3 on X99 boards. We can see the caps setup is different, but that’s about it!

Last up in these images are pictures of the two processors from a “head on” angle. I took this shot to show you the Broadwell-E processor has a much thinner PCB than the Haswell-E. The slightly taller IHS appears to make up the height difference.

5820K (Left) and 6950X (right)

5820K (left) and 6950X (right) – Rear

Note the thinner PCB on the 6950X (right)

Benchmarks

The data points we have come from old data (the Broadwell-based i7-5775C) as well as new data captured from the Haswell-E-based i7-5820K (mine), an i7-5960X (courtesy of Benching Team Leader and Reviewer, Johan45), and a Skylake-based i7-6700K (also mine). I was also able to capture new data points at matching speeds (4 GHz, DDR4 3000 MHz CL15-15-15-35) in order to see any instructions per clock (IPC) increases in the CPU tests as well as in a couple of games and benchmarks. While not perfect, the data we have gathered will give us a great idea of its performance both at stock and at matching clockspeeds, the latter to see IPC performance increases between Haswell-E and Broadwell-E.

i7-5775C
Motherboard: ASUS Maximus VII Formula
Memory: G.SKILL TridentX 2X8 GB 2400 MHz
HDD: Samsung 840 EVO 500 GB
Power Supply: Corsair HX1050
Video Card: EVGA GTX 780 Ti Classified
Cooling: EK-Supreme LTX Water Block, 360 mm Radiator, MCP35X Pump
OS: Windows 7 x64

i7-5820K
Motherboard: ASRock X99 OC Formula
Memory: Kingston HyperX 4x4GB DDR4 3000 MHz
HDD: Samsung 950 Pro 500 GB
Power Supply: EVGA Supernova G2 750 W
Video Card: GIGABYTE GTX 980 Ti Xtreme Gaming
Cooling: EK-Supreme LTX Water Block, Swiftech MCR320 360 mm + 240 mm radiator, MCP655-B Pump
OS: Windows 10 x64

i7-6700K
Motherboard: ASUS Maximus VIII Extreme
Memory: G.SKILL Trident Z 2X8 GB DDR4 3200 MHz @ 3000 MHz
HDD: OCZ Trion 150 480 GB SSD
Power Supply: Seasonic Platinum 1 kW
Video Card: GIGABYTE GTX 980 Ti Xtreme Gaming
Cooling: Swiftech Water Block, D5 Pump
OS: Windows 10 x64

i7-5960X
Motherboard: GIGABYTE X99 SOC Champion
Memory: G.SKILL Ripjaws4 DDR4-3000 MHz 4X4 GB 15-15-15-35
HDD: Samsung 850 EVO mSATA 250G
Power Supply: Superflower Leadex 1 kW
Video Card: EVGA GTX 980 Ti Classified
Cooling: Hyper 212 Evo
OS: Windows 10 x64

And the test system:

Test Setup
CPU: Intel i7 6950X @ stock and 4 GHz
CPU Cooler: Custom Loop with EK LTZ CPU Block, Swiftech MCP655 Vario, Swiftech MCR320 + PA 120.2, 3x Yate Loon High @ 1K RPM
Motherboard: MSI X99A Gaming Pro Carbon
RAM: G.SKILL Trident Z 4x8GB DDR4-3200 14-16-16-35 @ DDR4-3000 15-15-15-35
Graphics Card: GIGABYTE GTX 980 Ti Xtreme Gaming
Hard Drive: OCZ Trion 150 512GB
Power Supply: Seasonic Platinum-1000
Operating System: Windows 10 Pro x64 (fully updated)
Benchmarks: See below
Equipment: Digital Multimeter

Both MSI and G.SKILL were kind enough to send a board and memory for the Broadwell-E review. MSI sent over their X99A Gaming Pro Carbon, while G.SKILL sent along a 4x8GB DDR4 3200 MHz CL14-14-14-24 2N kit (F4-3200C14Q2-64GTZ) to extract the most out of said system. Please see the links above for details on each product and other great products both offer.

And the test system (minus the GPU) all lit up!

Test System (GIGABYTE 980Ti Xtreme Gaming not pictured)

Benchmarks Used

All benchmarks were run with the motherboard set to optimized defaults (outside of some memory settings which had to be configured manually). When “stock” is mentioned along with the clockspeed, it does not reflect the boost clocks. For example, the 5960X boosts all cores to 3.3 GHz, but the 6950X boosts them to 3.5 GHz (at least this is what this motherboard set it at). In the graphs, both are listed at 3.0 GHz, their base clock.

2D Tests

  • AIDA64 Engineer Memory, CPU, and FPU Tests
  • Cinebench R11.5 and R15
  • x265 1080p Benchmark (HWBOT)
  • POV-Ray
  • Intel XTU
  • SuperPi 1M/32M
  • wPrime 32M/1024M
  • 7-Zip

All CPU tests were run at their default settings unless otherwise noted.

3D Tests (Game and Synthetic)

All game tests were run at 1920×1080 and 2560×1440.

  • 3DMark Fire Strike Extreme – Default extreme settings (runs at 2560×1440)
  • Battlefield 4 – Ultra Preset
  • Crysis 3 – 1920×1080, Very high settings, 16x AF, 8x MSAA
  • Dirt: Rally – Ultra Preset + Advanced smoothing enabled
  • Ashes of the Singularity – “Crazy” Preset

Below is our first taste of benchmarks. We start out with our trusty CPU and memory testing application, AIDA64. Here we look at the individual CPU tests and see how these processors perform in a stock vs. stock configuration. This allows turbo to kick in, both Turbo Boost 2.0 and Turbo Boost Max 3.0, for single-threaded applications. In the case of the 6950X, we mentioned it will turbo all cores to 3.5 GHz (again, the motherboard sets it to 3.4 GHz to start) while the 5960X boosts all cores to 3.3 GHz. In everything outside of PhotoWorxx we see some pretty impressive gains here: going by the raw data below, the 6950X scores roughly 21% to 29% higher than the 5960X.

According to AIDA64 notes, PhotoWorxx “cannot effectively scale in situations where more than 2 processing threads used. For example, on a 8-way Pentium III Xeon system the 8 processing threads will be ‘fighting’ over the memory, creating a serious bottleneck that would lead to as low scores as a 2-way or 4-way similar processor based system could achieve”. We are seeing scaling past eight threads, but even on these quad-channel platforms, it seems like the fighting they are talking about still manages to create a bottleneck.

AIDA64 CPU Tests – Stock

AIDA64 CPU Benchmarks – Raw Data
CPU | Queen | PhotoWorxx | Zlib | AES | Hash
i7 5775C | 45880 | 23655 | 326.8 | 16365 | 3918
i7 6700K @ 4 GHz | 48776 | 25823 | 358.2 | 18248 | 4594
i7 5820K | 59395 | 30319 | 434.1 | 22974 | 5179
i7 5960X | 78823 | 30727 | 596.3 | 31802 | 7167
i7 6950X | 96796 | 30694 | 752.4 | 38615 | 9230
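If you want to turn the raw AIDA64 CPU numbers into percentages yourself, a minimal sketch follows. The scores are copied from the table above; the gain is measured relative to the older 5960X (new divided by old), which is the usual convention, and the helper name is just for illustration.

```python
# AIDA64 CPU scores at stock, copied from the raw-data table above
# (higher is better).
I7_5960X = {"Queen": 78823, "PhotoWorxx": 30727, "Zlib": 596.3,
            "AES": 31802, "Hash": 7167}
I7_6950X = {"Queen": 96796, "PhotoWorxx": 30694, "Zlib": 752.4,
            "AES": 38615, "Hash": 9230}


def gain_percent(new, old):
    """Percentage by which `new` exceeds `old`."""
    return (new / old - 1) * 100


gains = {test: gain_percent(I7_6950X[test], I7_5960X[test])
         for test in I7_5960X}

for test, gain in gains.items():
    print(f"{test:<11} {gain:+6.1f}%")
```

Dividing the difference by the newer chip's score instead would understate the gains, so it pays to be explicit about the baseline.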

Moving on to the Floating Point Unit (FPU) tests, we are seeing solid scaling out of SinJulia at nearly 20% over the 5960X. With Mandel and Julia, however, it seems we are not seeing scaling past eight cores, with a mere 5% between them. VP8 is a video compression benchmark which uses the VP8 video codec. Here it seems like clockspeed is king even though it says it is core/thread aware.

AIDA64 FPU Tests – Stock

AIDA64 FPU Benchmarks – Raw Data
CPU | VP8 | Julia | Mandel | SinJulia
i7 5775C | 5873 | 24513 | 13184 | 4625
i7 6700K @ 4 GHz | 7532 | 34048 | 18322 | 4830
i7 5820K | 6674 | 39957 | 21475 | 4974
i7 5960X | 6414 | 55518 | 29730 | 6448
i7 6950X | 6995 | 58484 | 31042 | 10908

AIDA64 memory shows the 6950X with the most bandwidth out of all the compared CPUs. This is in part due to its quad-channel architecture as well as the AIDA64 memory benchmark being multi-threaded. The more threads you have, the more bandwidth you can get. I would expect things are just the same as far as real world speeds are concerned.

AIDA 64 Memory Test – Stock

AIDA64 Memory Benchmarks – Raw Data
CPU | Read (MB/s) | Write (MB/s) | Copy (MB/s) | Latency (ns)
i7 5775C | 37589 | 37376 | 42403 | 53.1
i7 6700K @ 4 GHz | 43542 | 44661 | 39900 | 43.4
i7 5820K | 51596 | 46939 | 58553 | 59.9
i7 5960X | 58303 | 46900 | 34776 | 64.0
i7 6950X | 62180 | 64534 | 69711 | 57.2

Next, for our other multi-threaded benchmarks (compression and rendering), we see pretty typical scaling at stock speeds between all the processors. The deca-core 6950X bests everyone in the roundup, as we would expect. Compared with the octo-core 5960X, which boosts to 3.3 GHz versus the 3.5 GHz of the 6950X, it shows an average increase of roughly 25% in the benchmarks where both chips have results.

7Zip, x265(HWBOT), POVRay, Cinebench R11.5/R15

Cinebench, 7zip, POVRay and x265 Benchmarks – Raw Data
CPU | 7Zip | CB R11.5 | CB R15 | POVRay | x265 (HWBOT)
i7 5775C | 23517 | 8.39 | 774 | 1560.85 | n/a
i7 6700K @ 4 GHz | 25812 | 9.79 | 893 | 1900.37 | 21.98
i7 5820K | 30617 | 11.00 | 1012 | 2082.87 | 22.42
i7 5960X | 42473 | 15.26 | 1410 | 2845.74 | n/a
i7 6950X | 51276 | 19.26 | 1791 | 3569.40 | 35.17
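To see how close these results come to "perfect" scaling, here is a rough sketch comparing the measured speedups with an idealized figure. The ideal assumes throughput scales linearly with cores times all-core turbo clock (10 × 3.5 GHz versus 8 × 3.3 GHz), which is a simplification of my own and ignores memory, cache, and IPC differences. x265 is left out since the table has no 5960X result for it.

```python
# Stock multi-threaded scores as (i7-5960X, i7-6950X), from the raw-data
# table above. Higher is better in all four tests.
SCORES = {
    "7-Zip": (42473, 51276),
    "Cinebench R11.5": (15.26, 19.26),
    "Cinebench R15": (1410, 1791),
    "POV-Ray": (2845.74, 3569.40),
}

# Assumption: throughput scales linearly with cores x all-core turbo clock.
ideal = (10 * 3.5) / (8 * 3.3)

speedups = {name: new / old for name, (old, new) in SCORES.items()}
mean_gain = (sum(speedups.values()) / len(speedups) - 1) * 100

for name, speedup in speedups.items():
    print(f"{name:<16} {(speedup - 1) * 100:+5.1f}%  "
          f"({speedup / ideal:.0%} of ideal)")
print(f"Mean gain: {mean_gain:.1f}% (ideal: {(ideal - 1) * 100:.1f}%)")
```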

Next are some benchmarks for our HWBOT/overclocking crowd. This graph shows multi-threaded performance with XTU and wPrime and single-threaded performance in SuperPi. In the multi-threaded benchmarks, we are seeing the 20-25% increase we have come to expect, while in the single-threaded benchmarks the value drops to around 5-10%. There is an increase in IPC performance from Haswell-E to Broadwell-E. That, coupled with the new Turbo Boost Max 3.0, is why we see a larger increase here than you will see later in the clock-for-clock performance.

Intel XTU, WPrime 32M/1024M, SuperPi 1M/32M

Intel XTU, SuperPi, and wPrime Benchmarks – Raw Data
CPU | Intel XTU | wPrime 1024M (s) | wPrime 32M (s) | SuperPi 32M (s) | SuperPi 1M (s)
i7 5775C | 996 | 183.221 | 5.899 | 520.370 | 10.374
i7 6700K @ 4 GHz | 1296 | 158.924 | 5.111 | 471.972 | 9.141
i7 5820K | 1351 | 142.087 | 4.763 | 541.953 | 10.883
i7 5960X | 1742 | 103.647 | 3.525 | 536.894 | 10.359
i7 6950X | 2204 | 77.42 | 2.894 | 509.764 | 9.517
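wPrime and SuperPi report times, where lower is better, so there are two common ways to express the improvement and they give different numbers. The sketch below shows both for the 5960X versus the 6950X at stock; the helper names are mine.

```python
# Stock timings in seconds as (i7-5960X, i7-6950X), from the raw-data
# table above. Lower is better.
TIMES = {
    "wPrime 1024M": (103.647, 77.42),
    "wPrime 32M": (3.525, 2.894),
    "SuperPi 32M": (536.894, 509.764),
    "SuperPi 1M": (10.359, 9.517),
}


def time_saved_percent(old_s, new_s):
    """How much shorter the new run is, relative to the old run."""
    return (old_s - new_s) / old_s * 100


def speedup_percent(old_s, new_s):
    """How much more work per second the new chip does."""
    return (old_s / new_s - 1) * 100


for name, (old_s, new_s) in TIMES.items():
    print(f"{name:<13} time saved {time_saved_percent(old_s, new_s):5.1f}%  "
          f"speedup {speedup_percent(old_s, new_s):5.1f}%")
```

A run that finishes in about a quarter less time is roughly a third faster in throughput terms, so it is worth saying which figure is being quoted.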

Game Results

Below I have compiled some results across a few popular games including Battlefield 4 (BF4), Dirt: Rally, Crysis 3, and Ashes of the Singularity. This was done with a 6700K, 5820K, and 6950X, all at 4 GHz with the memory at DDR4 3000 MHz CL15-15-15-35-2N timings. In some titles, you are seeing a few FPS increase as the core count goes up, but truly it isn’t much, particularly between the 5820K and 6950X. We know most games just simply don’t need more than four cores to get optimal performance as we are seeing here.

One special thing to note, however, is with Ashes of the Singularity. In this title, while the FPS didn’t really increase, the benchmark does report how much the test system limits it. When I increased the core count, the CPU dependency dropped compared to a CPU with fewer cores. With DirectX 12 (DX12) breaking on to the scene, at least in name, it’s only a matter of time before more DX12 titles come out which will use more cores. This will likely take a good couple of years, so those with quad cores don’t have to sweat for a while to come.

Head to Head Results

In this last set of slides, I ran all the CPU benchmarks at 4 GHz with DDR4 3000 MHz CL15-15-15-35-2N memory. The point here was to show off the enhancement more cores and threads give on the multi-threaded benchmarks, and to show off any IPC improvements/shortcomings over the previous generation of CPUs.

Looking at the AIDA tests, we can see significant improvements across the board as we would expect. We are not seeing a 1:1 change here of 25% (25% more cores and threads compared with 5960X) with these specific tests. We can see where clockspeed rules in VP8 and the memory constraints in PhotoWorxx due to lack of core scaling.

With our HWBOT multi-threaded suite, consisting of 7Zip, x265 (HWBOT), POVRay, and Cinebench R11.5/R15, we start to see a lot more consistent scaling here. Across all of these benchmarks, we are seeing almost a 23% increase from just the number of cores and some IPC improvements over Haswell-E. The same goes for the next slide in Intel XTU and WPrime. In those two benchmarks we are again seeing a nearly 23% increase.

The single-threaded benchmarks SuperPi 1M and 32M were included to show, as best we can with different systems, an IPC difference. In this case, we didn’t happen to see anything measurable for all intents and purposes. But again it was on different systems and memory (though primary timings are the same, secondary/tertiary are not). Where you can really see a difference though is at stock speeds and the performance increase Turbo Boost 3.0 offers going above and beyond Turbo Boost 2.0. Intel claims up to a 15% difference, which would put a single core at up to 4 GHz, when compared to the previous generation of Turbo Boost. Be aware, however, your mileage may vary here as each CPU and its cores are different. Where I may boost to 4 GHz, yours may not reach as high or maybe it will reach higher?

Overclocking

Onto the most exciting part of the review… well, at least for me, the overclocking! We are going to take the MSI X99A Gaming Pro Carbon and lean on the 10 core/20 thread monster a bit and see where we come out for daily driving. Since the board won’t be the limiting factor, the only things getting in our way are temps and voltage. We have already seen 4 GHz in the testing above. So, to make this section worth it, we have to be above that, right? We are. I ended up at 4.2 GHz/1.20 V. It was able to run some pretty heavy benchmarks in Cinebench R15, wPrime 32M/1024M, and Hyper Pi 1M, so I moved on. After I knocked out those benchmarks, I put it under a stress test in AIDA64’s System Stability Test (default) and let it rip. It was stable enough there to run for at least 30 minutes (it ran for over an hour, actually, as I went back and forth between watching the Cleveland Cavaliers beat the Toronto Raptors and writing this article) as well as to get in some Battlefield 4 for over an hour. Seemed stable enough!

4.2 GHz, 3200 MHz Memory 1.20V

Power Consumption and Temperatures

Trying to cool a 140 W processor isn’t a terribly difficult thing to do, but is certainly something to consider when overclocking. Power consumption can tip the scales at 300 W when only stress testing the CPU (overclocked). Some high-end air cooling, a 2×120 mm AIO, or even a custom loop are good, better, best; as always. Our test system uses a full custom loop sporting 5×120 mm worth of radiator area just for the CPU. There are a total of five 120 mm Yate Loon High (D12SH-12) mounted on those radiators running at 1000 RPM (practically silent). For suggestions, check out our roundup of coolers for the Broadwell-E platform.

Each test was run for one hour to allow the loop to normalize. Between runs, the system was left idle for 30 minutes, allowing its temperatures to return to idle. Room temperature was taken at the beginning of each test. Any differences have been included in the end result.

Since the loop is so big, I’m not going to mention the stock temperatures… it was cool though, as you can see in the graph. Stepping the clockspeed up to 4 GHz and the voltage to 1.15 V (actual load voltage) still yielded some pretty solid peak temps. In all the testing I topped out at 63 °C running the Prime95 (v28.5) small FFT torture test. AIDA64 peaked at 58 °C during its most stressful test, FPU (check off only the FPU test for the most stress), while the coolest running test was AIDA64’s default, with Stress CPU, FPU, cache, and system memory selected. In this circumstance it hit 50 °C. Obviously temperatures will be higher with a smaller loop. So it seems the thermals are in order. Being the HEDT platform, and like Haswell-E, it uses STIM to move heat away from the die and to the IHS.

Temperatures

Moving on to power consumption below, we can see the 140 W chip doesn’t really sip on power, though it isn’t a preposterous number either, particularly when you stop to consider there are 10 cores and 20 threads in total on this monster CPU. While sitting idle with all power savings enabled (loaded BIOS defaults), I sat at 83 W stock and 85 W overclocked. To get stress test loads, I used the two programs I typically use to see if my system is stable, AIDA64 and Prime95. In the “default” AIDA64 tests, which stress CPU, FPU, cache, and memory at the same time, I peaked around 165 W at the wall stock and 230 W while at 4 GHz. The FPU-only test, which puts a lot more stress on the CPU itself as it uses the AVX/FMA instruction sets, came in nearly 25% higher at 205 W stock, and at 265 W overclocked. Prime95 v28.5 takes it to another level and uses even more power, coming in at 227 W stock and 300 W overclocked (system). Last, but not least, I played BF4 for a while with my GIGABYTE GTX 980 Ti Xtreme Gaming card. Total system power consumption was 401 W stock and 476 W while the CPU was overclocked. Not too bad for the total amount of cores and threads this CPU has on board!
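For a quick look at what the overclock costs at the wall, the readings above can be tabulated as in this sketch. These are whole-system numbers, so the deltas include everything else drawing power, not just the CPU.

```python
# Wall power in watts (whole system) as (stock, 4 GHz overclock),
# taken from the paragraph above.
WALL_POWER_W = {
    "Idle": (83, 85),
    "AIDA64 default": (165, 230),
    "AIDA64 FPU": (205, 265),
    "Prime95 v28.5": (227, 300),
    "Battlefield 4": (401, 476),
}

for load, (stock, oc) in WALL_POWER_W.items():
    delta = oc - stock
    print(f"{load:<15} {stock:>3} W -> {oc:>3} W  (+{delta} W, {delta / stock:.0%})")

# How much harder the FPU-only test is than the default AIDA64 mix at stock.
fpu_vs_default = WALL_POWER_W["AIDA64 FPU"][0] / WALL_POWER_W["AIDA64 default"][0] - 1
print(f"FPU vs default at stock: {fpu_vs_default:.1%}")
```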

Power Consumption

Conclusion

There you have it folks, Intel’s 10 core, 20 thread, Mega-Tasking monstrosity of a high-end chip. Overall we have seen improvements in both multi-threaded applications/benchmarks and single-threaded applications/benchmarks (though, admittedly, not so much here in an IPC capacity). Due to variations across the test systems, it was difficult to pin down exactly how much IPC was gained. We did see good scaling with the additional cores and threads, as one would expect, with gains approaching 30% in both Cinebench benchmarks, while averaging about 20%+ overall. Intel’s new Turbo Boost Max Technology 3.0 works a treat, boosting well past the typical Turbo Boost 2.0 specifications for improvements up to 15% in single-threaded applications. Overall you have the tick-tock improvement as well as an increase in cores. This chip is all about content creation, rendering, and even gaming, Twitch, and encoding… at the same time.

We were able to overclock to a “stable” 4.2 GHz on this sample resulting in a 40% increase from stock or 20% increase over the maximum turbo boost for all cores (3.5 GHz). Temperatures were kept in order by the large loop and STIM we are accustomed to. I don’t imagine this kind of overclock to be difficult to achieve on a high performing 2×120 mm loop. Will all samples get there? Who knows, but this one did! My sample CPU will get some LN2 treatment as soon as I get settled in from my move here in a couple of weeks. Stay tuned!
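The overclock percentages above work out as in this short sketch, using the 3.0 GHz base clock and the 3.5 GHz all-core turbo this motherboard applied.

```python
BASE_GHZ = 3.0            # i7-6950X base clock
ALL_CORE_TURBO_GHZ = 3.5  # all-core turbo as set by this motherboard
OVERCLOCK_GHZ = 4.2       # the "stable" overclock reached on this sample


def uplift_percent(target, reference):
    """Percentage increase of `target` over `reference`."""
    return (target / reference - 1) * 100


over_base = uplift_percent(OVERCLOCK_GHZ, BASE_GHZ)
over_turbo = uplift_percent(OVERCLOCK_GHZ, ALL_CORE_TURBO_GHZ)

print(f"Over base clock:     {over_base:.0f}%")
print(f"Over all-core turbo: {over_turbo:.0f}%")
```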

What isn’t there to love? Well, for most who won’t use this chip for all of those things professionally (getting paid), the price. Coming in at $1723 it is, by far, the most expensive consumer level chip I have ever seen. Sure, there are Xeons more expensive, but this is the HEDT platform, where the previous record was $999. This is literally a 70% increase in price. To be frank, this chip isn’t really made for most people. Most people would be plenty happy with a much more affordable hex-core CPU in the Broadwell-E lineup or even the quad core with HyperThreading on the latest mainstream CPU, Skylake and the Z170 chipset, and saving hundreds. If you are into productivity, and time is money for you, paying out the premium for the high-end 6950X could be a viable choice with how much faster it plows through many popular applications such as Handbrake, Adobe Premiere Pro CC, or Blender. I have to say on a personal level, I think the pricing would be more palatable for more people if it sank down to around $1499…or less. A man can dream, right?!

Wrapping things up, Intel really brought to the table a monster of a multi-threaded, Mega-Tasking CPU with the 6950X and the entire Broadwell-E lineup. Existing socket 2011-v3 motherboards get an updated BIOS to support the new CPUs, and there are some new/refreshed boards on the same X99 chipset. What is out on the market now will work best after the BIOS flash. Keep an eye out on the front page for reviews of EVGA, GIGABYTE, and ASUS motherboards! Outside of price, there really isn’t much to complain about. Single-threaded performance is going to be notably better in many cases due to the new Turbo Boost Max Technology 3.0, in addition to the IPC increases, while the sheer amount of cores and threads in the top of the line 6950X makes it a productivity monster, which will knock some serious time off of projects. I would look lower down the ladder if you are not a professional, but for those who can use the power inside this CPU, there isn’t much better out there.

Joe Shields (Earthdog)

Discussion
  1. Thanks all! It really is an absolute beast assuming you can use all of its cores.
    As soon as I get settled and have more time with the chip, I will get an overclocking guide out as well as take it cold. I had some issues overclocking much past that 4.2 GHz you saw in the article. I reached 4.5 GHz for a screenshot, but was sitting around 1.4V which made me a bit nervous. It was brute force. I have another board I just swapped out very late last night (this morning before I published) and hoping maybe it will get me over the hump.
    I have read from another review (anand) they were at 4.1 GHz and 1.31V so it seems like I have a slightly better than average CPU (or at least better than their sample) to drive... not sure what the issue was... (looks in the mirror).
    This chip has its small niche, I mean mega-tasking, and dominates in that niche. I agree with the assessment that the CPU is only worth the money to those who use heavily multi-threaded applications for their livelihood.
    mdcomp
    /sells kidney for $1700

    Don't get ripped off, do some research first :p
    Hi everyone ,
    Hello ,
    we all know that games dont use more than 8 threads today ...
    so to take advantage of an 8 cores or 10 cores CPU in Gaming you should Disable HT (Hyperthreading) and run the gaming test again to compare it against the 4 cores i7 6700K .
    and test it with SLI as well to reach the i7 6700k bottleneck !
    let me put it more simple ,
    The i7 6700K has 4 cores and can oc to 4.4 ghz easy . this CPU will give us 8 Virtual cores comparable to 2.2 GHZ clock for each virtual core .
    However the 8 Coresi7 6900K , With the HT Turned OFF , will give us 8 cores @ 4.4 ghz EACH !
    Thats double the speed of the 4 cores i7 ! if the game uses 8 threads .
    EVEN if we dont OC the 8 cores , it would be 3.2GHZ VS 2.2 GHZ !!!
    if you ask why Disable HT ? simple because the game will never use 16 Virtual cores !!! and the advantage is LOST .
    Please run the test again for games with HT turned off in the 8 cores and 10 cores cpu .
    and to stress the CPU more , TEST SLI as well , we want the i7 6700K to bottleneck !
    THANKS
    oh and Intel Should release i5 Broadwel-E CPU , 8 cores without HT , CHEAPER and BETTER for GAMERS
    samer1970
    Hi everyone ,
    Hello ,
    we all know that games dont use more than 8 threads today ...
    so to take advantage of an 8 cores or 10 cores CPU in Gaming you should Disable HT (Hyperthreading) and run the gaming test again to compare it against the 4 cores i7 6700K .
    and test it with SLI as well to reach the i7 6700k bottleneck !

    Games don't even make use of 8 threads very well, since there's almost no difference between the i5 6600K (4c/4t) and the i7 6700K (4c/8t). Are you wanting to see 4 real cores vs 8 real cores vs 10 real cores?
    Testing 2-way, 3-way, and 4-way SLI with the i7 6700K and i7 6950X at the same clocks with or without HT enabled on both CPUs would show if, when, and how much 4c/8t or 4c/4t is a bottleneck in different SLI configurations. This would be interesting just to see, and might be useful to a very few people out there.
    samer1970
    let me put it more simple ,
    The i7 6700K has 4 cores and can oc to 4.4 ghz easy . this CPU will give us 8 Virtual cores comparable to 2.2 GHZ clock for each virtual core .
    However the 8 Coresi7 6900K , With the HT Turned OFF , will give us 8 cores @ 4.4 ghz EACH !
    Thats double the speed of the 4 cores i7 ! if the game uses 8 threads .
    EVEN if we dont OC the 8 cores , it would be 3.2GHZ VS 2.2 GHZ !!!

    Assuming your 4.4 GHz OC, that means the 6700K has four cores each running at 4.4 GHz, and each one of those 4.4 GHz cores can process two threads at once with HT enabled. When HT is turned off, each one of those 4.4 GHz cores will only process one thread at a time. There's no dividing speed by threads to get a "speed per thread"; it doesn't work like that. Also, we can't assume the 6900K will OC like a 6700K, although it may very well make 4.4 GHz.
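
To illustrate the point above, here is a toy model (not a benchmark). It assumes each physical core runs at its full clock whether or not HT is on, and that a second thread sharing a busy core adds only a fraction of extra throughput. The function name and the `ht_gain` value of 0.25 are illustrative assumptions, not measurements.

```python
def effective_throughput(cores, ghz, software_threads, ht=True, ht_gain=0.25):
    """Rough throughput in 'core-GHz' for a workload with a fixed thread count.

    ht_gain is a made-up illustrative figure for how much a second hardware
    thread adds to an already-busy core; real gains vary by workload.
    """
    # Threads are placed one per physical core first; each runs at full clock.
    busy_cores = min(cores, software_threads)
    total = busy_cores * ghz
    if ht:
        # Leftover threads share an already-busy core via its second HT slot.
        shared = min(max(software_threads - cores, 0), cores)
        total += shared * ghz * ht_gain
    return total


# A game that only runs 4 threads gets the same result from 4 or 8 cores.
four_core = effective_throughput(4, 4.4, software_threads=4)
eight_core = effective_throughput(8, 4.4, software_threads=4, ht=False)
print(four_core == eight_core)
```

In this model the clock is never divided by the thread count. The extra cores only pay off once the software actually spawns more threads than the smaller CPU has cores.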
    samer1970
    if you ask why Disable HT ? simple because the game will never use 16 Virtual cores !!! and the advantage is LOST .
    Please run the test again for games with HT turned off in the 8 cores and 10 cores cpu .
    and to stress the CPU more , TEST SLI as well , we want the i7 6700K to bottleneck !
    THANKS
    oh and Intel Should release i5 Broadwel-E CPU , 8 cores without HT , CHEAPER and BETTER for GAMERS

    I can agree that stressing the CPU more with multi-GPU setups would be an interesting read, but I don't see what you're going for with the CPU and HT testing unless you want 4 vs 8 vs 10 real cores (see if games take advantage of more real cores better than cores+threads). I mean, we know that there's slight to no IPC improvement between 6700K and 6950X, so a game that only takes advantage of 4c/8t will only take advantage of 4c/8t in the 6950X as well. Therefore, since there is little to no IPC improvement between 6700K and 6950X, there would be little to no performance increase between 6700K and 6950X in that game (results in the review confirm this). However, with the 6950X you would have 6c/12t free and unused by that game, which means more things could be done while gaming simultaneously without slowing down the game performance. The 6950X is about heavy multi-tasking and heavily multi-threaded applications. Games are only going to use ~40% of the 6950X's resources, the game (software) is the limiting factor.
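
The 6c/12t and ~40% figures above are simple arithmetic, sketched here for anyone who wants to plug in other CPUs. The function name is hypothetical, and it assumes a game that uses exactly 4c/8t as described in the post.

```python
def game_footprint(cpu_cores, game_cores, threads_per_core=2):
    """Return (free cores, free threads, share of threads the game uses)."""
    free_cores = cpu_cores - game_cores
    free_threads = free_cores * threads_per_core
    used_share = (game_cores * threads_per_core) / (cpu_cores * threads_per_core)
    return free_cores, free_threads, used_share


# A 4c/8t game on the 10c/20t i7-6950X
free_cores, free_threads, used_share = game_footprint(cpu_cores=10, game_cores=4)
print(free_cores, free_threads, f"{used_share:.0%}")
```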
    MattNo5ss
    ...
    Testing 2-way, 3-way, and 4-way SLI with the i7 6700K and i7 6950X at the same clocks with or without HT enabled on both CPUs would show if, when, and how much 4c/8t or 4c/4t is a bottleneck in different SLI configurations. This would be interesting just to see, and might be useful to a very few people out there.
    ...

    There will never be a case of bottle neck . There are more than enough lanes for these CPUs, and the mobo makers should know how to architect with this availability.
    The PCIe lanes are not really the question though. It's the fact that with most high end cards in SLI/CFx, the faster your CPU speed is, the better your FPS are. In those situations, I do not believe that threading has anything to do with it. We saw similar scaling for a quad as you did with quad + HT, as you did with a hex + HT (not sure about a dual core).
    Alright so I don't have time to go through a complete tear down (I'm typing these up in meetings).
    Here is a quick glimpse at what can cause a common bottleneck in desktop mobos: An Intel CPU has a bunch of available lanes (a lot more than what is utilized). These lanes are grouped into channels (since PCI-E is point to point). Each channel has a window to receive and transmit data. So if the channels are distributed correctly, and the signals on the channels are laid out correctly (think physical link), then the n-number of devices connected to these channels will have enough time to move their data to and from the CPU. When you don't have these basics designed correctly (or they are gimped due to costs), then you will see bottlenecks in different tests.
    A great example is multi-gpu with m.2. The GPUs typically will receive the best designed channels so that they never have issues. Yet an m.2 that is now just entering the market will not. Why gimp these channels? Simple answer: cost. More lanes = more board layers. PCI-E 2.x, 3.0 and future versions will be harder and harder to implement. PCI-E has a great recovery system, but that creates high overhead (who cares with PC). Faster cores help mask this problem. Higher PCI-E bus somewhat helps, but ruins the channel signal integrity.
    To add: Think channels as highways. With an X amount of cars present, the more lanes you have, the smaller your window. The smaller number of lanes increase the window. PCI-E uses windows for data transfer. Only 1 window at a time for communication.
    I'm hoping to buy/bin 2-3 of these, keep the best, then sell (or return) the reject(s). No 'real world' advantage but dang I want to bench 2D and newer 3D with this beast.
    Gotta wait for some cold results and more reviews and guides before even thinking about pulling the trigger on one of these, though.
    Dolk
    There will never be a case of bottle neck . There are more than enough lanes for these CPUs, and the mobo makers should know how to architect with this availability.

    That statement was about the processing power of the CPU as the number of GPUs increases, and at what point the 4 cores of the 6700K are not able to crunch enough data for all the GPUs, not really the number of PCIe lanes available. It would help answer the question: How many GPUs can be added before I need a CPU upgrade? You know, a Pentium G3258 has 16 PCIe lanes just like a 4770K has 16 PCIe lanes, but a 2c/2t Pentium will surely bottleneck a double/triple GPU setup.
    The available PCIe lanes are 16 vs 40 (6700K vs 6950X), so the 6950X could do a little better in configs with more than 2 GPUs since motherboards won't need a PLX chip.
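
As a rough sketch of the lane math above: the 16 and 40 lane counts come from the post, while the x8-per-GPU figure and the function name are assumptions for illustration. Real boards also depend on how the slots are physically wired.

```python
def max_gpus_without_switch(cpu_lanes, lanes_per_gpu=8):
    """How many GPUs fit on CPU lanes alone, with no PLX switch chip.

    lanes_per_gpu=8 is an assumed minimum link width per card.
    """
    return cpu_lanes // lanes_per_gpu


cpu_lane_counts = {"i7-6700K": 16, "i7-6950X": 40}
gpu_limits = {cpu: max_gpus_without_switch(lanes)
              for cpu, lanes in cpu_lane_counts.items()}
for cpu, limit in gpu_limits.items():
    print(f"{cpu}: up to {limit} GPUs at x8 without a switch chip")
```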
    Dolk
    Alright so I don't have time to go through a complete tear down (I'm typing these up in meetings).
    Here is a quick glimpse at what can cause a common bottleneck in desktop mobos: An Intel CPU has a bunch of available lanes (a lot more than what is utilized). These lanes are grouped into channels (since PCI-E is point to point). Each channel has a window to receive and transmit data. So if the channels are distributed correctly, and the signals on the channels are laid out correctly (think physical link), then the n-number of devices connected to these channels will have enough time to move their data to and from the CPU. When you don't have these basics designed correctly (or they are gimped due to costs), then you will see bottlenecks in different tests.
    A great example is multi-gpu with m.2. The GPUs typically will receive the best designed channels so that they never have issues. Yet an m.2 that is now just entering the market will not. Why gimp these channels? Simple answer: cost. More lanes = more board layers. PCI-E 2.x, 3.0 and future versions will be harder and harder to implement. PCI-E has a great recovery system, but that creates high overhead (who cares with PC). Faster cores help mask this problem. Higher PCI-E bus somewhat helps, but ruins the channel signal integrity.
    To add: Think channels as highways. With an X amount of cars present, the more lanes you have, the smaller your window. The smaller number of lanes increase the window. PCI-E uses windows for data transfer. Only 1 window at a time for communication.

    Still not sure why we are talking about PCIe lanes...