AMD Ryzen 5 2400G and Ryzen 3 2200G APU Review


AMD has had much success this past year with its fully redesigned Zen CPU core. First, it gave us Ryzen and our first look at AMD’s building-block approach, which uses CPU Complexes (CCXs), the cornerstone of the design, and the Infinity Fabric which ties these “blocks” together. This approach allows AMD to easily “stack” these four-core, eight-thread CCXs to increase core count: take the Threadripper CPU with as many as 16 cores and 32 threads, or, as in this case, attach a CCX to a Radeon Vega graphics core and renew the APU lineup.

Today I have the AMD Ryzen 5 2400G and the Ryzen 3 2200G on the test bench. These are AMD’s all-new, revised APUs based on the Zen architecture and Radeon graphics, code-named Raven Ridge. AMD has found through independent research that PCs sold without a discrete graphics card make up 30% of the market, and an APU of the Ryzen family would be ideally suited to this segment.

With suggested pricing of $169.00 for the Ryzen 5 2400G and $99.00 for the Ryzen 3 2200G, AMD has set a compelling price point that does not require a dedicated GPU. In AMD’s testing, the Ryzen 5 2400G APU often compares favorably to $75 dedicated GPUs, making this APU a wise choice when it comes to performance per dollar for price-conscious consumers. These two APUs will ultimately replace the Ryzen 5 1400 and Ryzen 3 1200 with similar or lower suggested pricing, higher base and boost clocks, and integrated graphics. It’s a natural progression.

Specifications and Features

Looking at the specifications table below, the 2400G is a quad-core with SMT for a total of eight threads. This total core/thread count comes from the use of a single CPU Complex (CCX) with SMT active. The base clock comes in at 3.6 GHz and will boost to 3.9 GHz on AMD’s improved Precision Boost 2 technology (more on this later). The 2400G APU also includes 11 Radeon Vega Compute Units clocked up to 1250 MHz. The 2200G is also built on one CCX for four cores but SMT isn’t active on this SKU. The base clock comes in at 3.5 GHz and will boost to 3.7 GHz with Precision Boost 2 technology and also includes eight Radeon Vega Compute Units clocked up to 1100 MHz.

Both are produced on AMD’s 14 nm FinFET process with a TDP (Thermal Design Power) of 65 W. The cooling medium between the die and IHS is TIM (Thermal Interface Material) instead of solder; AMD chose this method to keep production costs down and pricing competitive.

There are benefits to using a single CCX, such as lower cost and a more compact size, which makes it more suitable for desktop as well as mobile solutions. This also leads to improved latency over a two-CCX CPU, but there are some drawbacks. The move reduces the L3 cache from 8 MB to 4 MB, which AMD has offset with higher CPU clocks. The new CPU package also allows Raven Ridge to officially support JEDEC DDR4-2933, the highest official memory clock of any consumer processor.

Regarding PCI Express (PCIe) support, Raven Ridge offers a total of 24 lanes from the CPU, with eight dedicated to graphics and eight for general use such as M.2 PCIe NVMe storage (four of those eight dedicated to the chipset). The remainder is split among SATA and USB 2.0, 3.1, and 3.1 Gen2 functionality. AMD’s decision to reduce the dedicated graphics PCIe lanes from sixteen to eight is based on the mid-range GPUs and workloads likely to be paired with the APU. The upside is that the chip is simpler to manufacture, allowing AMD to reduce consumer costs.

Windows 10 is the officially supported platform for the Ryzen APUs. At this point, it’s unclear whether or not any legacy Operating Systems such as Windows 7 will be supported.

APU  AMD Ryzen 5 2400G  AMD Ryzen 3 2200G
# of Cores  4  4
# of Threads  8  4
Base Clock Speed  3.6 GHz  3.5 GHz
Boost Clock Speed  3.9 GHz  3.7 GHz
Instruction Set  64-bit  64-bit
Instruction Set Extensions  SSE 4.1/4.2/4a, AVX2, SHA  SSE 4.1/4.2/4a, AVX2, SHA
Lithography  14 nm FinFET  14 nm FinFET
Transistor Count  4.94 billion  4.94 billion
TDP  65 W  65 W
Thermal Solution Spec  Traditional nonmetallic TIM  Traditional nonmetallic TIM
Integrated Graphics  11 Radeon Vega CUs, up to 1250 MHz  8 Radeon Vega CUs, up to 1100 MHz
L1 Cache  64 KB I-Cache, 32 KB D-Cache per core  64 KB I-Cache, 32 KB D-Cache per core
L2 Cache  2 MB (512 KB per core)  2 MB (512 KB per core)
L3 Cache  4 MB shared  4 MB shared
Memory Specifications
Max Memory Size  128 GB  128 GB
Memory Types  DDR4-2933  DDR4-2933
# of Memory Channels  2  2
ECC Memory Support  No  No

The table below lists the Ryzen APU desktop lineup equipped with AMD’s new Radeon Vega graphics. In it, we see the Ryzen 5 2400G at the top with its four-core, eight-thread configuration and 11 Radeon Vega compute units, followed by the Ryzen 3 2200G with four cores, four threads, and eight Radeon Vega compute units. Both APUs are overclockable, assuming you buy a motherboard with a chipset capable of doing so.

AMD Ryzen APU Model  Cores/Threads  Base Clock  Boost Clock  L3 Cache  Cooler Included  Graphics  TDP
Ryzen 5 2400G  4/8  3.6 GHz  3.9 GHz  4 MB  Wraith Spire  11 CUs @ 1250 MHz  65 W
Ryzen 3 2200G  4/4  3.5 GHz  3.7 GHz  4 MB  Wraith Spire  8 CUs @ 1100 MHz  65 W


AMD SenseMI Technology

The following information was provided by AMD.

First and foremost, it is important to understand that each AMD Ryzen processor has a distributed “smart grid” of interconnected sensors that are accurate to 1 mA, 1 mV, 1 mW, and 1 °C with a polling rate of 1000/sec. These sensors generate vital telemetry data that feed into the Infinity Fabric control loop, and the control loop is empowered to make real-time adjustments to AMD Ryzen processor’s behavior based on current and expected future operating conditions.

AMD SenseMI is a package of five related “senses” that rely on sophisticated learning algorithms and/or the command-and-control functionality of the Infinity Fabric to empower AMD Ryzen processors with Machine Intelligence (MI). This intelligence is used to fine-tune the performance and power characteristics of the cores, manage speculative cache fetches, and perform AI-based branch prediction.

  • Pure Power
    The distributed network of smart sensors that drive Precision Boost can do double duty to streamline processor power consumption with any given workload. And for next-level brilliance: telemetry data from the Pure Power optimization loop allows each AMD Ryzen processor to inspect the unique characteristics of its own silicon to extract individualized power management.

  • Precision Boost 2
    After the unveiling of Precision Boost and the AMD Ryzen desktop processor, AMD observed scenarios where 3+ cores are in use, yet the overall size of the workload is relatively small. This creates a scenario where the “all core boost” state is triggered, even though there is no imminent electrical, thermal, or utilization boundary that would practically halt further clock speed increases. This scenario represents an additional opportunity to drive higher performance: the thermal, electrical, and utilization headroom of the product can be converted into higher clock speeds. Precision Boost 2 carries forward the 25 MHz granularity of its predecessor, but importantly transitions to an algorithm that will intelligently pursue the highest possible frequency until an aforementioned limit is encountered, or the rated frequency of the part is met (whichever comes first). This applies to any number of threads in flight, without arbitrary limits. Precision Boost 2 could be described as opportunistic, linear, or graceful, and a conceptual comparison of Precision Boost 1 vs. 2 has been plotted for clarity below.

  • If a hardware limit is encountered, Precision Boost 2 is designed to level off and employ its granular clock selection to dither within a small range of frequencies around the leveling-off point. This process is a continuous adjustment loop managed by the AMD Infinity Fabric, and it cycles up to 1000 times per second. A real-world example of this is shown below with OCCT, where the boost gracefully transitions across one to eight threads and then maintains a max-thread clock speed well above the base. Taken as a whole, Precision Boost 2 invests the AMD Ryzen Processor with Radeon Vega Graphics with greater performance in real-world multi-threaded applications by freeing the CPU to make the most performant clock selection for its defined electrical/thermal/load/frequency capacity, regardless of the number of threads in flight.

  • Neural Net Prediction
    A true AI inside every AMD Ryzen processor harnesses a neural network to do real-time learning of an application’s behavior and speculate on its next moves. The predictive AI readies vital CPU instructions so the processor is always primed to tackle a new workload.

  • Smart Prefetch
    Sophisticated learning algorithms understand the internal patterns and behaviors of applications and anticipate what data will be needed for fast execution in the future. Smart Prefetch predictively pre-loads that data into large caches on the AMD Ryzen processor to enable fast and responsive computing.
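The opportunistic clock selection Precision Boost 2 performs can be modeled with a short sketch. This is a conceptual illustration only: the base, boost, and 25 MHz step values match the Ryzen 5 2400G's rated specs from the table above, but the headroom model and function names are invented for clarity and are not AMD's actual control loop.

```python
# Conceptual sketch of Precision Boost 2 (not AMD's real algorithm):
# climb in 25 MHz steps until a thermal/electrical/utilization limit
# or the rated boost clock is reached, regardless of thread count.

BASE_MHZ = 3600    # Ryzen 5 2400G base clock
BOOST_MHZ = 3900   # rated boost clock
STEP_MHZ = 25      # Precision Boost granularity

def boost_clock(headroom_mhz, active_threads):
    """Return the opportunistic clock: base plus as many 25 MHz steps
    as the available headroom allows, capped at the rated boost clock.
    Unlike Precision Boost 1, thread count imposes no arbitrary limit."""
    steps = headroom_mhz // STEP_MHZ
    return min(BASE_MHZ + steps * STEP_MHZ, BOOST_MHZ)
```

With plenty of headroom, `boost_clock(1000, 8)` hits the 3900 MHz cap even with all eight threads active; with only 110 MHz of headroom, the clock dithers up to 3700 MHz in 25 MHz increments.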

SMT (Simultaneous Multi-Threading)

This is AMD’s new equivalent to Intel’s Hyper-Threading (HT) technology. It allows each core to execute two threads, adding performance in multi-threaded applications.

Every Processor is Unlocked
AMD is allowing overclocking on all CPU models, much as they have in the past. The only caveat this time around is you must have a motherboard with a chipset supporting overclocking (X370, B350, or X300).

The “Zen” X86 Microarchitecture

  • Performance
    On the performance side, the Zen micro-architecture represents a quantum leap in core execution capability versus AMD’s previous desktop designs. Notably, the Zen architecture features a 1.75x larger instruction scheduler window and 1.5x greater issue width and resources; this change allows Zen to schedule and send more work into the execution units. Further, a micro-op cache allows Zen to bypass L2 and L3 cache when utilizing frequently-accessed micro-operations. Zen also gains a neural network-based branch prediction unit which allows the Zen architecture to be more intelligent about preparing optimal instructions and pathways for future work. Finally, products based on the Zen architecture may optionally utilize SMT to increase utilization of the compute pipeline by filling app-created pipeline bubbles with meaningful work.

  • Throughput
    A high-performance engine requires fuel, and the Zen architecture’s throughput characteristics deliver in this regard. Chief amongst the changes are major revisions to the cache hierarchy, with dedicated 64 KB L1 instruction and data caches, 512 KB of dedicated L2 cache per core, and 8 MB of L3 cache shared across four cores. This cache is augmented with a sophisticated learning prefetcher that speculatively harvests application data into the caches so it is available for immediate execution. Altogether, these changes establish lower-level cache nearer to the core, netting up to 5x greater cache bandwidth into a core.

  • Efficiency
    Beyond adopting the more power-efficient 14 nm FinFET process, the Zen architecture specifically utilizes the density-optimized version of the GlobalFoundries 14 nm FinFET process. This permits smaller die sizes and lower operating voltages across the complete power/performance curve. The Zen architecture also incorporates AMD’s latest low-power design methodologies, such as: the previously mentioned micro-op cache to reduce power-intensive faraway fetches, aggressive clock gating to zero out dynamic power consumption in minimally utilized regions of the core, and a stack engine for low-power address generation into the dispatcher.
    It is in this realm, especially, that the power management wisdom of AMD’s APU teams shines through to impart in Zen the ability to scale from low-wattage mobile to HEDT configurations.

  • Scalability
    Scalability in the Zen architecture starts with the CPU Complex (CCX), a natively four core eight thread module. Each CCX has 64 KB L1 I-Cache, 64 KB L1 D-Cache, 512 KB dedicated L2 cache per core, and 8 MB L3 cache shared across cores. Each core within the CCX may optionally feature SMT for additional multi-threaded capabilities.
    More than one CCX can be present in a Zen-based product; the AMD Ryzen processor features two CCXs for a total of eight cores and 16 threads. Individual cores within a CCX may be disabled by AMD, and the CCXs communicate across the high-speed Infinity Fabric. This modular design allows AMD to scale core, thread, and cache quantities as necessary to target the full spectrum of the client, server, and HPC markets.

  • Infinity Fabric
    The Infinity Fabric, meanwhile, is a flexible and coherent interface/bus that allows AMD to quickly and efficiently integrate a sophisticated IP portfolio into a cohesive die. These assembled pieces can utilize the Infinity Fabric to exchange data between CCXs, system memory, and other controllers (e.g., memory, I/O, PCIe) present on the AMD Ryzen SoC design. The Infinity Fabric also gives the Zen architecture powerful command-and-control capabilities, establishing a sensitive feedback loop that allows for real-time estimations and adjustments to core voltage, temperature, socket power draw, clock speed, and more. This command-and-control functionality is instrumental to AMD SenseMI technology.

The Vega Graphics Architecture

Seventeen years since the introduction of the first Radeon, the usage model for graphics processors continues to expand, both within the realm of visual computing and beyond. AMD’s customers are employing GPUs to tackle a diverse set of workloads spanning from machine learning to professional visualization and virtualized hosting–and into new fields like virtual reality. Even traditional gaming constantly pushes the envelope with cutting-edge visual effects and unprecedented levels of visual fidelity in the latest games. Along the way, the data sets to be processed in these applications have mushroomed in size and complexity. The processing power of GPUs has multiplied to keep pace with the needs of emerging workloads, but the throughput of nearly all types of high-performance processors has been increasingly gated by power consumption.

With these needs in mind, the Radeon Technologies Group set out to build a new architecture known as Vega. Vega is the most sweeping change to AMD’s core graphics technology since the introduction of the first GCN-based chips five years ago. The Vega architecture is intended to meet today’s needs by embracing several principles: flexible operation, support for large data sets, improved power efficiency, and extremely scalable performance. Vega introduces a host of innovative features in pursuit of this vision, which we’ll describe in the following pages. This new architecture promises to revolutionize the way GPUs are used in both established and emerging markets by offering developers new levels of control, flexibility, and scalability.

Next Generation Geometry

To meet the needs of both professional graphics and gaming applications, the geometry engines in Vega have been tuned for higher polygon throughput by adding new fast paths through the hardware and by avoiding unnecessary processing. This next-generation geometry (NGG) path is much more flexible and programmable than before.

To highlight one of the innovations in the new geometry engine, primitive shaders are a key element in its ability to achieve much higher polygon throughput per transistor. Previous hardware mapped quite closely to the standard Direct3D rendering pipeline, with several stages including input assembly, vertex shading, hull shading, tessellation, domain shading, and geometry shading. Given the wide variety of rendering technologies now being implemented by developers, however, including these stages isn’t always the most efficient way of doing things. Each stage has various restrictions on inputs and outputs that may have been necessary for earlier GPU designs, but such restrictions aren’t always needed on today’s more flexible hardware.


Vega’s new primitive shader support allows some parts of the geometry processing pipeline to be combined and replaced with a new, highly efficient shader type. These flexible, general-purpose shaders can be launched very quickly, enabling more than four times the peak primitive cull rate per clock cycle.

In a typical scene, around half of the geometry will be discarded through various techniques such as frustum culling, back-face culling, and small-primitive culling. The faster these primitives are discarded, the faster the GPU can start rendering the visible geometry. Furthermore, traditional geometry pipelines discard primitives after vertex processing is completed, which can waste computing resources and create bottlenecks when storing a large batch of unnecessary attributes. Primitive shaders enable early culling to save those resources.
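As a minimal illustration of one of these culling techniques, the sketch below performs back-face culling in screen space using the sign of a triangle's area. The function names and the counter-clockwise-is-front winding convention are assumptions for this example, not Vega's hardware implementation.

```python
# Toy back-face culling: a triangle whose vertices wind clockwise in
# screen space faces away from the camera and can be discarded early,
# before any further vertex attributes are computed or stored.

def signed_area(tri):
    """Signed area of a 2D triangle; positive = counter-clockwise."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def cull_backfaces(triangles):
    """Keep only counter-clockwise (front-facing) triangles."""
    return [t for t in triangles if signed_area(t) > 0]
```

Discarding roughly half the scene's geometry this early is exactly what lets the rest of the pipeline start on visible primitives sooner.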

Primitive shaders can operate on a variety of different geometric primitives, including individual vertices, polygons, and patch surfaces. When tessellation is enabled, a surface shader is generated to process patches and control points before the surface is tessellated, and the resulting polygons are sent to the primitive shader. In this case, the surface shader combines the vertex shading and hull shading stages of the Direct3D graphics pipeline, while the primitive shader replaces the domain shading and geometry shading stages.

Geometry Engine Load Balancing with NGG

Primitive shaders have many potential uses beyond high-performance geometry culling. Shadow-map rendering is another ubiquitous process in modern engines that could benefit greatly from the reduced processing overhead of primitive shaders. We can envision even more uses for this technology in the future, including deferred vertex attribute computation, multi-view/multi-resolution rendering, depth pre-passes, particle systems, and full-scene graph processing and traversal on the GPU.

Primitive shaders will coexist with the standard hardware geometry pipeline rather than replacing it. In keeping with Vega’s new cache hierarchy, the geometry engine can now use the on-chip L2 cache to store vertex parameter data. This arrangement complements the dedicated parameter cache, which has doubled in size relative to the prior generation Polaris architecture. This caching setup makes the system highly tunable and allows the graphics driver to choose the optimal path for any use case. Combined with high-speed HBM2 memory, these improvements help to reduce the potential for memory bandwidth to act as a bottleneck for geometry throughput.

Another innovation of Vega’s NGG is improved load balancing across multiple geometry engines. An intelligent workload distributor (IWD) continually adjusts pipeline settings based on the characteristics of the draw calls it receives to maximize utilization.

One factor that can cause geometry engines to idle is context switching. Context switches occur whenever the engine changes from one render state to another, such as when changing from a draw call for one object to that of a different object with different material properties. The amount of data associated with render states can be quite large, and GPU processing can stall if it runs out of available context storage. The IWD seeks to avoid this performance overhead by avoiding context switches whenever possible.

Some draw calls also include many small instances (i.e., they render many similar versions of a simple object). If an instance does not include enough primitives to fill a wavefront of 64 threads, then it cannot take full advantage of the GPU’s parallel processing capability, and some proportion of the GPU’s capacity goes unused. The IWD can mitigate this effect by packing multiple small instances into a single wavefront, providing a substantial boost to utilization.
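The utilization benefit of packing can be sketched with a little arithmetic. This toy model (all names invented for the example) compares how many 64-thread wavefronts are launched with and without packing small instances together:

```python
# Toy model of the IWD's small-instance packing: instead of launching
# one under-filled 64-thread wavefront per instance, pack the threads
# of several small instances into shared wavefronts.

WAVEFRONT = 64  # threads per wavefront on GCN/Vega

def wavefronts_unpacked(instance_sizes):
    """At least one wavefront per instance, however small it is."""
    return sum(-(-size // WAVEFRONT) for size in instance_sizes)

def wavefronts_packed(instance_sizes):
    """Greedily pack all instances' threads into shared wavefronts."""
    total_threads = sum(instance_sizes)
    return -(-total_threads // WAVEFRONT)
```

Eight instances of 10 threads each would occupy eight mostly idle wavefronts unpacked, but only two when packed, a 4x improvement in lane utilization in this contrived case.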

Next Generation Compute Unit (NCU) with Rapid Packed Math

GPUs today often use more mathematical precision than necessary for the calculations they perform. Years ago, GPU hardware was optimized solely for processing the 32-bit floating point operations that had become the standard for 3D graphics. However, as rendering engines have become more sophisticated—and as the range of applications for GPUs has extended beyond graphics processing—the value of data types beyond FP32 has grown.

The programmable compute units (Figure 7) at the heart of Vega GPUs have been designed to address this changing landscape with the addition of a feature called Rapid Packed Math. Support for 16-bit packed math doubles peak floating-point and integer rates relative to 32-bit operations. It also halves the register space as well as the data movement required to process a given number of operations. The new instruction set includes a rich mix of 16-bit floating point and integer instructions, including FMA, MUL, ADD, MIN/MAX/MED, bit shifts, packing operations, and many more.

For applications that can leverage this capability, Rapid Packed Math can provide a substantial improvement in compute throughput and energy efficiency. In the case of specialized applications like machine learning and training, video processing, and computer vision, 16-bit data types are a natural fit, but there are benefits to be had for more traditional rendering operations, as well. Modern games, for example, use a wide range of data types in addition to the standard FP32. Normal/direction vectors, lighting values, HDR color values, and blend factors are some examples of where 16-bit operations can be used.
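The packing idea itself can be illustrated in software. The sketch below models two 16-bit lanes riding in one 32-bit word, so a single "packed" add does the work of two scalar adds; this is a conceptual model with invented helper names, not the NCU's actual instruction set (which operates on FP16 as well as integers).

```python
# Conceptual model of packed math: two unsigned 16-bit values share
# one 32-bit register, and one packed operation processes both lanes.

def pack2(hi, lo):
    """Pack two unsigned 16-bit values into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)

def packed_add(a, b):
    """Add the two 16-bit lanes independently (no carry across lanes)."""
    lo = ((a & 0xFFFF) + (b & 0xFFFF)) & 0xFFFF
    hi = ((a >> 16) + (b >> 16)) & 0xFFFF
    return (hi << 16) | lo
```

One `packed_add` call replaces two scalar additions, which is the source of the doubled peak rate and halved register/data-movement cost described above.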

With mixed-precision support, Vega can accelerate the operations that don’t benefit from higher precision while maintaining full precision for the ones that do. Thus, the resulting performance increases need not come at the expense of image quality.

In addition to Rapid Packed Math, the NCU introduces a variety of new 32-bit integer operations that can improve performance and efficiency in specific scenarios. These include a set of eight instructions to accelerate memory address generation and hashing functions (commonly used in cryptographic processing and cryptocurrency mining), as well as new ADD/SUB instructions designed to minimize register usage.

The NCU also supports a set of 8-bit integer SAD (Sum of Absolute Differences) operations. These operations are important for a wide range of video and image processing algorithms, including image classification for machine learning, motion detection, gesture recognition, stereo depth extraction, and computer vision. The QSAD instruction can evaluate 16 4×4-pixel tiles per NCU per clock cycle and accumulate the results in 32-bit or 16-bit registers. A maskable version (MQSAD) can provide further optimization by ignoring background pixels and focusing computation on areas of interest in an image.
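As a rough software model of what these instructions compute, the sketch below evaluates a Sum of Absolute Differences over one 4×4 tile of 8-bit pixels, with an optional mask mirroring MQSAD's ability to ignore background pixels. The function and its flattened-tile signature are illustrative, not the hardware interface.

```python
# Toy Sum of Absolute Differences over a 4x4 tile of 8-bit pixels,
# the core operation behind motion detection and block matching.
# tiles are flat lists of 16 values in the range 0..255.

def sad_4x4(tile_a, tile_b, mask=None):
    """Sum |a - b| over the tile; an optional mask skips pixels,
    modeling the maskable MQSAD variant."""
    total = 0
    for i in range(16):
        if mask is None or mask[i]:
            total += abs(tile_a[i] - tile_b[i])
    return total
```

In hardware, the QSAD instruction evaluates 16 such tiles per NCU per clock; this loop just shows what each one computes.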

Revised Pixel Engine

As ultra-high-resolution and high-refresh displays become more widespread, maximizing pixel throughput is becoming more important. Monitors with 4K+ resolutions and refresh rates up to 240 Hz are dramatically increasing the demands on today’s GPUs. The pixel engines in the Vega architecture are built to tackle these demands with an array of new features.

The Draw-Stream Binning Rasterizer (DSBR) is an important innovation to highlight. It has been designed to reduce unnecessary processing and data transfer on the GPU, which helps both to boost performance and to reduce power consumption. The idea was to combine the benefits of a technique already widely used in handheld graphics products (tiled rendering) with the benefits of immediate-mode rendering used in high-performance PC graphics.

Standard immediate-mode rendering works by rasterizing each polygon as it is submitted until the whole scene is complete, whereas tiled rendering works by dividing the screen into a grid of tiles and then rendering each tile independently.

The DSBR works by first dividing the image to be rendered into a grid of bins or tiles in screen space and then collecting a batch of primitives to be rasterized in the scan converter. The bin and batch sizes can be adjusted dynamically to optimize for the content being rendered. The DSBR then traverses the batched primitives one bin at a time, determining which ones are fully or partially covered by the bin. Geometry is processed once, requiring one clock cycle per primitive in the pipeline. There are no restrictions on when binning can be enabled, and it is fully compatible with tessellation and geometry shading.
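The binning step can be sketched in a few lines. The toy function below assigns a primitive's screen-space bounding box to the grid bins it overlaps; the 32-pixel bin size and the names are illustrative (the real DSBR sizes bins dynamically, as noted above).

```python
# Toy screen-space binning: map a primitive's bounding box to the set
# of bins (tiles) it touches, so each bin can later be rasterized
# entirely from data kept in fast on-chip memory.

BIN = 32  # illustrative bin edge length in pixels

def bins_touched(bbox):
    """Return the set of (bx, by) bin coordinates overlapped by an
    inclusive (x0, y0, x1, y1) pixel bounding box."""
    x0, y0, x1, y1 = bbox
    return {(bx, by)
            for bx in range(x0 // BIN, x1 // BIN + 1)
            for by in range(y0 // BIN, y1 // BIN + 1)}
```

A primitive contained in one tile lands in a single bin, while one straddling a tile corner is recorded in all four neighboring bins so no coverage is missed.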

This design economizes off-chip memory bandwidth by keeping all the data necessary to rasterize geometry for a bin in fast on-chip memory (i.e., the L2 cache). The data in off-chip memory only needs to be accessed once and can then be reused before moving on to the next bin. Vega uses a relatively small number of tiles, and it operates on primitive batches of limited size compared with those used in previous tile-based rendering architectures. This setup keeps the costs associated with clipping and sorting manageable for complex scenes while delivering most of the performance and efficiency benefits.

Pixel shading can also be deferred until an entire batch has been processed so that only visible foreground pixels need to be shaded. This deferred step can be disabled selectively for batches that contain polygons with transparency. Deferred shading reduces unnecessary work by reducing overdraw (i.e., cases where pixel shaders are executed multiple times when different polygons overlap a single screen pixel).

Deferred pixel processing works by using a scoreboard for color samples prior to executing pixel shaders on them. If a later sample occludes or overwrites an earlier sample, the earlier sample can be discarded before any pixel shading is done on it. The scoreboard has limited depth, so it is most powerful when used in conjunction with binning.
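A toy version of this scoreboard idea: track only the nearest sample per pixel, then shade once per pixel afterwards instead of shading every overlapping fragment. This is a simplified software model with invented names (it ignores transparency and the scoreboard's limited depth), not the hardware design.

```python
# Toy visibility scoreboard for deferred pixel shading: later, nearer
# samples displace earlier ones before any pixel shading is done,
# eliminating overdraw for opaque geometry.

def resolve_visibility(fragments):
    """fragments: iterable of (pixel, depth, color); smaller depth is
    nearer. Returns {pixel: color} keeping only the front-most sample."""
    nearest = {}
    for pixel, depth, color in fragments:
        if pixel not in nearest or depth < nearest[pixel][0]:
            nearest[pixel] = (depth, color)
    return {p: c for p, (d, c) in nearest.items()}
```

Three overlapping fragments on one pixel resolve to a single shaded sample, which is the overdraw reduction the DSBR's deferred step provides.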

These optimizations can significantly reduce off-chip memory traffic, boosting performance in memory-bound scenarios and reducing total graphics power consumption. In the case of Vega desktop GPUs, we observed memory bandwidth reductions of up to 33% when the DSBR is enabled for existing game applications, with no increase in power consumption.

Built for Higher GPU Clock Speeds

One of the key goals for the Vega architecture was achieving higher operating clock speeds than any prior Radeon GPU. Put simply, this effort required the design teams to close on higher frequency targets. The simplicity of that statement belies the scope of the task, though. Meeting Vega’s substantially tighter timing targets required some level of design effort for virtually every portion of the chip.

In some units—for instance, in the texture decompression data path of the L1 cache—the teams added more stages to the pipeline, reducing the amount of work done in each clock cycle to meet Vega’s tighter timing targets.

Adding stages is a common means of improving the frequency tolerance of a design, but those additional stages can contribute more latency to the pipeline, potentially impacting performance. In many cases, these impacts can be minor. In our texture decompression example, the additional latency might add up to two clock cycles out of the hundreds required for a typical texture fetch—a negligible effect.
In other instances, on more performance-critical paths, the Vega project required creative design solutions to better balance frequency tolerance with per-clock performance. Take, for example, the Vega NCU: the design team made major changes to the compute unit to improve its frequency tolerance without compromising its core performance.

First, the team changed the fundamental floor plan of the compute unit. In prior GCN architectures with less aggressive frequency targets, the presence of wired connections of a certain length was acceptable because signals could travel the full distance in a single clock cycle. For this architecture, some of those wire lengths had to be reduced so signals could traverse them within the span of Vega’s much shorter clock cycles. This change required a new physical layout for the Vega NCU with a floor plan optimized to enable shorter wire lengths.

This layout change alone wasn’t sufficient, though. Key internal units, like the instruction fetch and decode logic, were rebuilt with the express goal of meeting Vega’s tighter timing targets. At the same time, the team worked very hard to avoid adding stages to the most performance-critical paths. Ultimately, they were able to close on a design that maintains the four-stage depth of the main ALU pipeline and still meets Vega’s timing targets.

Vega also leverages high-performance custom SRAMs originally developed by the Zen CPU team. These SRAMs, modified for use in the general-purpose registers of the Vega NCU, offer improvements on multiple fronts, with 8% less delay, an 18% savings in die area, and a 43% reduction in power use versus standard compiled memories.

AMD Ryzen APU Topology

Employing the Zen, Vega, and Infinity Fabric technologies described in the previous section, the AMD Ryzen Processor with Radeon Vega Graphics employs the physical topology shown below (Figure 10). The Infinity Fabric services six unique clients representing different categories of technologies in the AMD IP portfolio. These clients are centrally monitored and managed via the data/control capabilities of the fabric.

Below is a die shot of the Raven Ridge APU compared to the Ryzen CPU structure.

AMD Raven Ridge APU Die Shot

AMD Ryzen Die Shot

Product Tour

Below are some images of the care package I received from AMD and the product packaging for the new Ryzen APUs. As you can see, the APU and the cooler each have their own packaging: the slender box AMD has been using for the APU for some time, and a very similar cardboard box for the cooler.


Next up are pictures of the Ryzen APU samples we have. The APUs are packaged in the usual fashion from AMD, with a plastic sleeve inside the small cardboard box, and include a case badge denoting Ryzen 5 or Ryzen 3 depending on the APU inside. Moving on to the Wraith Spire CPU cooler: it will keep the Ryzen 5 2400G within its thermal envelope at stock speeds, but that’s about as far as it goes. During stress testing, the APU would reach temperatures in the mid-80s °C, which is still under the throttling limit of 95 °C but dashes any hope of overclocking far on the stock cooler. It was also a bit awkward to install, but as you can see, the stock TIM shows good coverage of the IHS, and there was just the right amount pre-applied.


Benchmarks

During the benchmarks, I wanted to give the APUs a fair shake in their own weight class, but the only CPU I had available with Intel UHD 630 graphics was an i7 8700K, a six-core, twelve-thread CPU that retails for over twice the suggested retail price of the Ryzen 5 2400G. The i3 8350K retails at the same price point as the 2400G APU, but I didn’t have one and wasn’t about to buy one either.

Just to be clear, I’m trying to be as fair as possible, so I took my i7 8700K and reduced it to a four-core, four-thread CPU at 4.0 GHz with a 3.7 GHz cache to mimic an i3 8350K as closely as possible with what I have. In the parts list, I have denoted the 8350K with an asterisk and added a footnote as well; going forward, I will refer to this CPU as the i3 8350K*.

Ryzen 5 2400G  Ryzen 3 2200G  A10-7870K  i3 8350K*
Motherboard MSI B350I PRO AC MSI B350I PRO AC ASUS Crossblade Ranger ASUS ROG Strix X370-E Gaming
Memory G.Skill FlareX 2×8 GB DDR4-3200 MHz 14-14-14-34 G.Skill FlareX 2×8 GB DDR4-3200 MHz 14-14-14-34 G.Skill 2×4 GB DDR3-2400 10-12-12-31 G.Skill FlareX 2×8 GB DDR4-3200 MHz 14-14-14-34
HDD Samsung 120 GB 840 EVO Samsung 120 GB 840 EVO Samsung 120 GB 840 EVO Samsung 120 GB 840 EVO
Power Supply Super Flower 1000W Platinum Super Flower 1000W Platinum Super Flower 1000W Platinum Super Flower 1000W Platinum
iGPU Radeon Vega 11 Radeon Vega 8 Radeon R7 512 Shaders Intel UHD 630
Cooling AMD Wraith Spire AMD Wraith Spire Noctua NH-D15 Noctua NH-D15
OS Windows 10 x64 Windows 10 x64 Windows 10 x64 Windows 10 x64

*i3 8350K = i7 8700K @ 4.0 GHz with four cores and four threads, to simulate an i3 8350K

In the review package from AMD, we find parts from MSI and G.Skill for the Ryzen APU review. The motherboard delivered was the MSI B350I PRO AC, a full-featured mini-ITX AM4 motherboard. For RAM, the package from G.Skill includes two FlareX DIMMs, a 2×8 GB kit rated for DDR4-3200 at 14-14-14-34. Shortly after the Ryzen launch last year, G.Skill released the FlareX and Fortis lines, which are specifically tuned for the AM4 platform. The full contents of the review package are pictured below.


The MSI B350I PRO AC, as I said, is a mini-ITX AM4 motherboard, but don't let its size fool you: MSI has packed a few goodies into that small real estate. The board is equipped with the B350 chipset and supports most AM4 CPUs currently available, though I did notice the Ryzen 7 1800X wasn't on the list, likely due to TDP constraints. The two DRAM slots will support up to 32 GB of DDR4 in dual channel at speeds up to 3200 MHz. There's one PCIe 3.0 x16 slot on the board and an M.2 connector on the back that supports both PCIe 3.0 x4 NVMe and SATA drives. These link speeds depend on the CPU used: with the Ryzen CPUs they run at full speed, but with the APUs' reduced lane count both are halved, to PCIe 3.0 x8 and x2 respectively. The MSI PRO AC also has USB 3.1 Gen2 connectivity, HDMI and DisplayPort outputs, and includes Intel dual-band wireless/Bluetooth.
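To put the halved links in perspective, here's a quick back-of-the-envelope sketch (my own arithmetic from the PCIe 3.0 signaling rate, not an MSI or AMD figure) of the theoretical per-direction bandwidth at each link width:

```python
# PCIe 3.0 signals at 8 GT/s per lane with 128b/130b encoding,
# so each lane carries about 8 * (128/130) / 8 ≈ 0.985 GB/s per direction.
PCIE3_GBPS_PER_LANE = 8 * 128 / 130 / 8

for width in (16, 8, 4, 2):
    gbps = width * PCIE3_GBPS_PER_LANE
    print(f"PCIe 3.0 x{width}: ~{gbps:.1f} GB/s per direction")
```

In practice a x8 link (~7.9 GB/s) is rarely a bottleneck for a single GPU, but the x2 M.2 link (~2.0 GB/s) will cap the fastest NVMe drives when an APU is installed.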


Benchmarks Used

All benchmarks were run with the motherboard set to optimized defaults (outside of some memory settings, which had to be configured manually). When "stock" is mentioned along with the clock speed, this includes Precision Boost 2 on the AMD Ryzen APUs. I tested this way to observe how AMD's updated Precision Boost 2 manipulates clock speeds under different loads. I'd also like to reiterate that I used an 8700K pared down to an 8350K performance level, set at a static 4 GHz with four cores and four threads. All onboard graphics were left at stock speeds for this testing.

After the testing, I set the AMD APUs to their maximum overclock for the CPU and iGPU that I could obtain on the MSI motherboard. I did find it had some limitations with voltage so I could only go so far. This will give you an idea of the possible performance gains to be had if you choose to overclock the APU. Memory was kept at the rated speeds for the FlareX DDR4 kit.

CPU Tests
  • AIDA64 Engineer CPU, FPU, and Memory Tests
  • Cinebench R11.5 and R15
  • x265 1080p Benchmark (HWBOT)
  • POVRay
  • SuperPi 1M/32M
  • WPrime 32M/1024M
  • 7Zip

All CPU tests were run at their default settings unless otherwise noted.

Gaming Tests

All game tests were run at 1920×1080 on low presets, and I verified V-Sync was disabled.

  • 3DMark Fire Strike
  • Middle Earth: Shadow of Mordor
  • Metro Last Light
  • Ashes of the Singularity
  • Rise of the Tomb Raider

AIDA64 Tests

Just a note here, I used the latest AIDA64 Engineer Beta for testing but it still doesn’t officially support the Ryzen APU. First up is the AIDA64 cache and memory benchmark.

AIDA64 Cache and Memory Benchmark

 

AIDA64 Cache and Memory Benchmark
| CPU | Read (MB/s) | Write (MB/s) | Copy (MB/s) | Latency (ns) |
| Ryzen 5 2400G @ 3.6 GHz | 46981 | 47573 | 41578 | 68.8 |
| Ryzen 3 2200G @ 3.5 GHz | 46750 | 47660 | 41652 | 67.3 |
| Intel i3 8350K* @ 4.0 GHz | 47089 | 47755 | 42998 | 44.6 |
| AMD A-10 7870K @ 3.9 GHz | 22236 | 12393 | 21413 | 77.0 |

As you can see, Ryzen is working much better with RAM than it was a year ago, but that latency is still quite high compared to Intel. Up next, the AIDA64 CPU benchmarks.

AIDA64 CPU Tests

 

AIDA64 CPU Tests
| CPU | Queen | PhotoWorxx | ZLib | AES | Hash |
| Ryzen 5 2400G @ 3.6 GHz | 46989 | 18366 | 356.4 | 33069 | 11308 |
| Ryzen 3 2200G @ 3.5 GHz | 30895 | 13768 | 228.1 | 29018 | 7335 |
| Intel i3 8350K* @ 4.0 GHz | 36091 | 26878 | 285.2 | 17698 | 4375 |
| AMD A-10 7870K @ 3.9 GHz | 18809 | 9791 | 175.3 | 8685 | 2745 |

As you can see, the four extra threads gave the 2400G a nice advantage through most of the CPU tests, and the 2200G wasn't all that far behind. On to the last of the AIDA64 benchmarks.

AIDA64 FPU Tests

 

AIDA64 FPU Tests
| CPU | VP8 | Julia | Mandel | SinJulia |
| Ryzen 5 2400G @ 3.6 GHz | 6170 | 19247 | 10051 | 6362 |
| Ryzen 3 2200G @ 3.5 GHz | 5786 | 18411 | 9588 | 4367 |
| Intel i3 8350K* @ 4.0 GHz | 6611 | 33031 | 18289 | 3391 |
| AMD A-10 7870K @ 3.9 GHz | 3788 | 6240 | 3184 | 1484 |

The floating-point tests seem to be a bit of a weak spot for the Ryzen-based APUs; even with the extra threads, the 2400G was left behind in all but the SinJulia test.

Real World Tests

Next, we will move on to something a bit more tangible/productivity based with compression, rendering, and encoding benchmarks.

Cinebench R11.5/R15, POVRay, x265 (HWBot), 7Zip

 

Cinebench R11.5/R15, POVRay, x265 (HWBot), 7Zip – Raw Data
| CPU | R11.5 | R15 | POVRay | x265 (fps) | 7Zip (MIPS) |
| Ryzen 5 2400G @ 3.6 GHz | 9.23 | 826 | 1702.86 | 19.84 | 21913 |
| Ryzen 3 2200G @ 3.5 GHz | 6.66 | 585 | 1374.92 | 17.04 | 16115 |
| Intel i3 8350K* @ 4.0 GHz | 7.82 | 683 | 1665.39 | 27.13 | 19203 |
| AMD A-10 7870K @ 3.9 GHz | 3.71 | 326 | 857.51 | 9.32 | 11912 |

Here again, the extra threads gave the 2400G a bit of an advantage over the Intel CPU in all but HWBot's x265 benchmark; the latest generation of Intel CPUs got a real boost in that test compared to their predecessors.
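Since price/performance comes up later in the conclusion, a quick back-of-the-envelope calculation (my own arithmetic, using the Cinebench R15 scores above and the suggested pricing from the intro, with the 8350K price taken as roughly the same as the 2400G's) shows why the 2200G is the value pick:

```python
# Cinebench R15 multi-threaded score paired with suggested price (USD).
# The i3 8350K price is an approximation per the review's framing.
chips = {
    "Ryzen 5 2400G": (826, 169.0),
    "Ryzen 3 2200G": (585, 99.0),
    "i3 8350K*": (683, 169.0),
}

for name, (score, price) in chips.items():
    print(f"{name}: {score / price:.2f} R15 points per dollar")
```

By this rough metric the 2200G delivers about 5.9 points per dollar versus roughly 4.9 for the 2400G and 4.0 for the simulated 8350K.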

Pi-Based Tests

Moving on from all the multi-threaded goodness above, we get to some Pi and prime-number based tests: SuperPi and wPrime, specifically. Even though AMD isn't particularly strong in these benchmarks, you can see there's a vast improvement over their Steamroller predecessor.

SuperPi and wPrime Benchmarks

 

SuperPi and wPrime Benchmarks – Raw Data
| CPU | SuperPi 1M (s) | SuperPi 32M (s) | wPrime 32M (s) | wPrime 1024M (s) |
| Ryzen 5 2400G @ 3.6 GHz | 11.066 | 625.307 | 6.425 | 181.54 |
| Ryzen 3 2200G @ 3.5 GHz | 12.114 | 671.124 | 8.863 | 271.828 |
| Intel i3 8350K* @ 4.0 GHz | 9.141 | 461.783 | 6.939 | 221.602 |
| AMD A-10 7870K @ 3.9 GHz | 17.891 | 957.549 | 12.298 | 392.797 |

Game Results

As far as the games go, tests were done at 1080p on low presets. These APUs are meant to be an affordable all-in-one solution, so gaming isn't their primary purpose, but as you'll see it is doable at acceptable frame rates. For the gamers out there, you definitely won't be disappointed with the performance of the Radeon Vega graphics!

1080p Gaming Results

As for the synthetic benchmark, 3DMark Fire Strike, you can see the results are similar to the graph above with the Vega graphics pulling in some impressive numbers for an iGPU.

3DMark Firestrike Benchmark

Precision Boost 2

Just a few words on my observations of AMD's improved boost function. First off, it behaves differently than the first iteration of Precision Boost. I'll start with the Ryzen 3 2200G, since the two chips behaved slightly differently from each other. The 2200G has a base clock of 3.5 GHz and boosts to 3.7 GHz, and from what I saw it stayed at its top boost frequency on all four cores even under heavy loads such as Cinebench R15 or HWBot x265. During single-threaded operations it would boost one or more cores up to 3.7 GHz, but the load appeared to move between different cores, almost erratically. I retested with affinity set to a single core so it was the only one that boosted; that core stayed at 3.7 GHz during the full test, but the benchmark scored the same, so the stock behavior didn't affect the outcome.
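For anyone wanting to repeat the single-core affinity test: on Windows this can be done from Task Manager's Details tab or with `start /affinity 1 <benchmark.exe>`. The same idea in a short Python sketch, shown purely to illustrate the mechanism (this uses `os.sched_setaffinity`, which is Linux-only; it is not the method used in the review):

```python
import os

# Restrict the current process (pid 0 = self) to logical CPU 0 only,
# so any boost behavior during the benchmark is confined to one core.
os.sched_setaffinity(0, {0})

# Confirm the affinity mask took effect before starting the workload.
print(os.sched_getaffinity(0))  # -> {0}
```

Launching the benchmark as a child of a process pinned this way inherits the mask, which is how the one-core boost behavior was isolated.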

The Ryzen 5 2400G behaved a bit differently in that it didn't reach full boost on all cores during heavy multi-threaded loads, instead hovering between 3.75 GHz and 3.8 GHz above its 3.6 GHz base. During light loads it did, however, reach its full boost clock of 3.9 GHz.

Overclocking

For overclocking I switched coolers from the included Wraith Spire to a Noctua NH-D15, since the 2400G was already close to its thermal limits running stock on the stock cooler. Even with the improved cooling I ran into a limitation of the motherboard, which wouldn't allow me to set the core voltage above 1.4 V. It could have been my sample, but the Ryzen 5 2400G was nearly at its limits regardless: I only managed a maximum of 3.95 GHz on the CPU cores and 1350 MHz on the GPU. It was stable at these settings, but I really didn't get the graphics improvements I was hoping for; as you'll see in the following results, gaming only improved by one or two frames per second. The Ryzen 3 2200G, on the other hand, had a lot more headroom. I managed 4.1 GHz on the CPU cores and 1300 MHz on the GPU, needing only a slight boost to the SOC/NB voltage to get that extra 200 MHz from the APU, and it even passed stability testing at 4.1 GHz on the included Wraith Spire cooler, which was quite surprising, to say the least.

So let’s see how they stacked up.

Cinebench R11.5/R15, POVRay, x265 (HWBot), 7Zip

SuperPi and wPrime Benchmarks

1080p Gaming

3DMark Firestrike Benchmark

Power Consumption and Temperatures

In the graph below we tested system power draw across multiple situations, from idle, to Prime95 Small FFT (with FMA3/AVX), to 3DMark Fire Strike for a combined load. At stock, the system pulled a maximum of 104 W under CPU-only load with the 2400G, while the 2200G maxed out at 93 W; both results came from the Prime95 Small FFT test. Even when overclocking, the 2400G only made it up to 122 W, and the 2200G 93 W at 4.1 GHz. Keep in mind this is full-system power usage; these CPUs definitely sip electricity.

Ryzen APU Power Consumption Stock

Temperatures were actually surprisingly well controlled with the included Wraith Spire cooler; I saw no throttling at any point. The highest stock temperature was 85 °C on the Ryzen 5 2400G during Prime95 Small FFT, which shows the stock cooler is adequate for the 2400G at stock settings but wouldn't hold up during overclocking. For the Ryzen 3 2200G it worked well all the way to 4.1 GHz. Mind you, the CPU was reaching and slightly passing 90 °C during Prime95 Small FFT stress testing, but it's all you would need.

Ryzen APU Temperatures at Stock

Conclusion

Overall, the performance from AMD's newest additions to the Ryzen stable is quite impressive. Compared to the A10-7870K there really is no contest from a computing perspective, and graphics performance has nearly doubled. Looking at the numbers and the dollars, the Ryzen 5 2400G compares quite favorably to the 8350K, which is in the same ballpark price-wise with an MSRP of $169.99. It has four more threads to aid in multi-threaded workloads and a pretty decent graphics processor if you feel like kicking back and doing some light gaming. Personally, I feel the real sweetheart is the $99.99 Ryzen 3 2200G, and I wouldn't be surprised to see it show up in a lot of OEM machines in the near future.

I think AMD has hit the nail on the head this time; these two APUs fit their intended market like a glove, offering the best of both worlds with great performance that won't break the bank. Overclockers Approved!

Click here to find out what this means.


Shawn Jennings – Johan45


Discussion
  1. They're definitely based on zen 14 nm parts. Zen+ coming in aprilish 12nm refresh mostly a shrink with some minor improvements and patch for the recent bug scare
    I was an AMD fanboy when I got here. Then I finally accepted Intel's overwhelming dominance and got a Team Blue chip that was a very good chip. And in a little over a year AMD is knocking (banging, actually) on my decently OC'd door. The next time Team Red comes to the door they're likely to be kicking the door in and messing the place up. Smh.
    Scores can down, up... it will depend. ;)

    Correct, these are APUs...this is NOT Zen+ which is said to be an update and better architecture and also said to overclock better. For all intents and purposes, its Zen with a vega gpu inside. Thats it. :)
    Johan45
    Hey Mack I did have that in the table near the beginning of the review.


    (if I keep quiet maybe he wont notice I didn't read every word...) :chair:

    Woomack
    I actually thought they will OC better as 2k series Ryzen was promised to OC better than the 1k series and so far both look about the same.

    2200G is something like 1200+IGP and for many users that IGP makes the difference... especially that price of both is about the same.


    My understanding is these APUs are enhanced Zen. The as yet unreleased 2000 parts will be the more interesting Zen+ parts that hopefully have more headroom.

    EarthDog
    Aida64 doesn't officially support this CPU... keep that in mind when trying to compare results. ;)


    Fair enough, to a point, but I'm not sure there'll be that much difference with optimisation.
    Alaric
    I just ran the Aida CPU benches in the review (Julia, SinJulia, VP8 and Mandel) and I'm impressed by the Ryzen's performance. And more than a little upset. In all but the Mandel my Skylake at 4.7 GHz got its butt kicked-badly. Are the Ryzens that much better??? It looks (to me) like AMD is back, with a vengeance.


    My 6700k at stock (4.0 GHz with 4.2 one core turbo) beats the Ryzens apart from SinJulia.

    For comparison I got:

    vp8 7298

    julia 33973

    mandel 18291

    sinjulia 4816

    For SinJulia, if I look at the example results given in Aida64, my score is within tolerance of the 6700k they have there, but they're suggesting a 3770k at 3.5 GHz base 3.9 turbo is faster?

    What does each do? https://www.aida64.com/products/features/benchmarking

    In short:

    VP8 - who knows

    Julia - 32-bit (single precision)

    Mandel - 64-bit (double precision)

    SinJulia - 80-bit (old x87)

    Now, my concern here is for Mandel they list "x87, SSE2, AVX, AVX2, FMA, and FMA4" as instructions used. I don't see FMA3 in there. The thing is, FMA3 was introduced with AVX2 in Haswell, and I'm bad at using them as if they're the same thing, but I'm not sure they technically are. If FMA4 is of benefit, FMA3 should be too. If FMA3 isn't used, it'll be missing out on a significant boost for Haswell or newer, and Ryzen can also try to take advantage of it. It supports it, but I'm not convinced it is effective. Similar story with FMA4 on older AMD processors.

    I've not really looked into x87 as there are few things that really need it now. But in what little I have done, is yet another prime number finding software called genefer. This is NOT based on the Prime95 codebase, but does have some similarities in that it does FFTs also. For a particular type of number, due to its size the extra precision of x87 was required over 64-bit, yet it used small FFTs. This should allow running out of cache, so isn't ram limited, and in that I saw a similar gap in IPC between Ryzen and Intel as for 64-bit. So SinJulia is behaving nothing like that. Further looking at the Aida64 example results, the 4770 and even 2600 (both non-K) are not far behind the 6700k, despite their lower clocks and older architectures. The only commonality I can see is they're all DDR3 systems. I'm wondering if there is some other feature of the CPU that helps these results more than the cores, and I'm thinking ram access in some way.
    Must have been something wonky the first time I ran the benches. I mopped the floor with everything on the FPU charts except SinJulia, where the 2400G fairly cleaned my clock. LOL I feel better now. Unless I think of the cost and effort involved. Then I get a little less smug. LOL
    I actually thought they will OC better as 2k series Ryzen was promised to OC better than the 1k series and so far both look about the same.

    2200G is something like 1200+IGP and for many users that IGP makes the difference... especially that price of both is about the same.

    I will try to get 2200G and make some tests vs 1200 and i3 8100. Will see if it end on full article or not.
    Hey Mack I did have that in the table near the beginning of the review.

    Just too bad they run out of gas real quick Alaric. If they could hit 5.0 like the blues...........
    Lol, would have gotten a free vega 9 and faster cpu cores for less than what I paid recently for the kids' 1200's (99USD). MC is selling 1200s for 5 bucks more than the 2200G's right now.
    I just ran the Aida CPU benches in the review (Julia, SinJulia, VP8 and Mandel) and I'm impressed by the Ryzen's performance. And more than a little upset. In all but the Mandel my Skylake at 4.7 GHz got its butt kicked-badly. Are the Ryzens that much better??? It looks (to me) like AMD is back, with a vengeance.
    To me it seems to have potential but doesn't quite run away with it against blue and green. Depending if you focus more on price, performance or power consumption, there are alternate things you can do comparable to it. Where it could be a hit is in lower end SFF systems, with the Intel/Vega taking the higher end. Budget gaming system is a bit of a stretch still, unless you stick to older or relatively undemanding games.

    I've been planning a retro case mod since... forever. The original plan was i3-4150T (2c4t 3.0GHz 35W) and pair that with a 750Ti. In particular I wanted to use a Pico-PSU to keep size minimal, and settled on a 150W model, that on reading the small print could only do 8A continuous on 12V rail (96W). Under stress conditions... that combo got rather closer to it than I'd like so I hit the pause button. Forward a bit, I saw a GT1030 on sale and got that to replace the 750Ti. I knew it was a little slower, but at half the power, I couldn't refuse. I didn't do anything with that either.

    So now, we have the 2400G. On average, it seems to be comparable to a 1030 with either taking a lead depending on if it is an red or green favouring game. I'm debating if this should figure into my plans now. Way I see it, it has two potential benefits. One is that the APU is 65W. Comparable to the i3+1030 separately, but I save on space by not having a dGPU even if I can move it around with extenders or whatever. I don't actually know if the title I'm thinking of is AMD friendly or not though, so that'll take some research. Thing is, I know the 750Ti was on the lower end of acceptable already. 1030 might already be dropping below that. Previous testing suggested it wasn't very CPU demanding, but with updates since then that could have changed.

    Edit: forgot to say... watching one of the videos out there on these products, they claimed AMD have also switched to TIM as opposed to soldering on these. I've only seen that claim in one place, so could use some verification, although I have no reason to disbelieve them. Assuming so, who's gonna be first to delid one of these and stick some LM on?

    Edit 2: ok, if you google "2400g tim" you get a ton of sites all saying it uses TIM not soldered. I think that's safe to know now.
    I was comparing Ryzen 1200 to anything from intel in similar price about 2 months ago. i3 cost a bit more, motherboard with coffee lake support cost more ... and now 2200G will cost about as much as Ryzen 1200 but with GPU which is not so much slower than the GF1030. I think it's great price for total performance. 2400G is worse idea if we look at price/performance of whole package.

    I don't think that most users who get these APUs will play any games on them. I also count that soon will be BIOS updates that will improve performance.
    I guess that would depend on the case you chose since the cooler is only 2" tall and I did say that the included cooler did fine under normal loads even Prime95 but the 2400G was running out of headroom.
    Yeah, Dave but you ain't going to get the total computing power with an i3 that you will get with the Ryzen 3200G and all those cores/threads. What you say would be more or less true for gaming though. Right now GT 1030s are going for $85-$100 and the i3 Coffee Lake for about $130. That's well over $200 compared to $190 for the 3200G. And many or most people will go water anyway on a mini ITX build, especially if they are overclocking.
    Sorry, I'm not impressed. For close to the same price, I could get an i3 with a GT 1030 with comparable performance. GT 1030s come in single-slot, low-profile so they can fit in the smallest cases. Plus the AMD stock cooler is too tall for very small ITX cases, and inadequate under maximum loads, which is one problem I was looking for. An i3 with its low-profile stock cooler paired with a low-profile, single-slot GT 1030 would be a much better fit for a tiny ITX box.