
Ryzen 9 7900x3d: why not SRAM chiplet on both CCDs?


magellan

According to the Tom's Hardware review, only one of the two CCDs on a 7950X3D or 7900X3D has the 3D V-cache chiplet attached. Why not both? Is this why some people disable a CCD when gaming? Will the next-gen X3D CPUs have the 3D V-cache chiplet attached to all CCDs on a given interposer/substrate? If Windows 10 or Windows 11 can preferentially schedule gaming loads to the X3D-equipped CCD, doesn't that imply you're running your games on a 7900X3D as a 6-core/12-thread CPU that doesn't boost as high as the CCD without the X3D chiplet (which in turn has less L3 cache)?

I still don't understand why AMD doesn't go with mackerel's suggestion and put an L4 cache on the IOD chiplet (maybe even a larger X3D L4 than the X3D L3 caches), because that would be a win for all CCDs and maybe even for the iGPU.
 
The primary reason is heat and turbo/boost performance. The X3D chips run hotter because the cores are not as close to the heatsink as on a normal chiplet. This means that the top-end boost is lower, as AMD can't let the chip cook itself.

They felt that having one chiplet with and one without the stacked cache was the best of both worlds for most workloads, letting things that benefit from the extra cache run on the X3D chiplet, and letting things that scale best with GHz run on the normal chiplet.

Now, as to why it's not on its own island chiplet with access to both chiplets... space, latency, cost. All three would go up, and none would improve performance. You're better off accepting the heat penalty in exchange for having the cache directly on top of the die.

What you're referencing is HBM, or high-bandwidth memory, on package, and that is the direction Apple and Intel are going. I would expect at some point AMD will find value in it as well, but I think they are still a little burned by Vega and the Radeon VII when it comes to on-package HBM.
 
The way they sold it was that you get the best of both worlds: high-clock cores for stuff that likes clocks, cache for stuff that likes cache, assuming OS scheduling and/or software can put the work on the right cores.
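In practice that scheduling is handled by the AMD chipset driver plus Windows Game Mode, but the effect is roughly what you'd get by restricting a game's CPU affinity yourself. A minimal Windows sketch of that idea (my own illustration, not AMD's actual mechanism; it assumes logical processors 0-11 are the V-cache CCD on a 7900X3D, which you'd want to verify with a topology tool first):

    /* Sketch: pin the current process to logical CPUs 0-11, which on a
       7900X3D are commonly reported as the V-cache CCD (CCD0). The
       core-to-CCD mapping is an assumption -- check your topology first. */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        DWORD_PTR mask = 0x0FFF; /* bits 0-11 = 6 cores x 2 SMT threads */
        if (!SetProcessAffinityMask(GetCurrentProcess(), mask)) {
            fprintf(stderr, "SetProcessAffinityMask failed: %lu\n",
                    GetLastError());
            return 1;
        }
        printf("Pinned to logical CPUs 0-11 (assumed X3D CCD).\n");
        return 0;
    }

Launch a game from a wrapper like this (or just use Task Manager's affinity setting) and you've done by hand what the scheduler is supposed to do automatically.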

Now, as to why it's not on its own island chiplet with access to both chiplets... space, latency, cost. All three would go up, and none would improve performance. You're better off accepting the heat penalty in exchange for having the cache directly on top of the die.
Cost and space are no worse than X3D. The main problem with the approach is that AMD's CPU Infinity Fabric is not very high bandwidth, and that would limit any benefit. So workloads that rely on unified access to data often have to fall back to RAM, as there's no good way to share large amounts of data closer to the cores.
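For a rough sense of that bandwidth ceiling, here's a back-of-envelope comparison using commonly cited Zen 4 figures (32 bytes/cycle read per CCD link at FCLK; treat these as assumptions, not official specs):

    /* Back-of-envelope: per-CCD Infinity Fabric read bandwidth vs.
       dual-channel DDR5-6000. Approximations, not official numbers. */
    #include <stdio.h>

    int main(void)
    {
        double fclk    = 2000e6;           /* 2000 MHz FCLK, typical with DDR5-6000 */
        double if_read = 32.0 * fclk;      /* assumed 32 bytes/cycle read per CCD */
        double dram    = 2 * 8.0 * 6000e6; /* 2 channels x 8 bytes x 6000 MT/s */

        printf("IF read per CCD: ~%.0f GB/s\n", if_read / 1e9); /* ~64 GB/s */
        printf("DDR5-6000 (2ch): ~%.0f GB/s\n", dram / 1e9);    /* ~96 GB/s */
        return 0;
    }

If those figures are in the right ballpark, a cache sitting behind the fabric couldn't deliver much more bandwidth to a CCD than RAM already does; only latency would improve.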
 
I was thinking: since the IOD chiplet is larger than the CCDs, and both CCDs have to go through the IOD to get to memory, why not just put a larger X3D L4 cache chiplet on top of the IOD chiplet? Maybe this is exactly what mackerel was proposing in another thread.

The IOD chiplet never gets as hot as the CCD chiplets, does it?
 
Not sure, tbh, about any temperature difference. But the latency of the Infinity Fabric, which is already impressive given how they are building these chips, would eat up most of the value of having a large, directly attached, high-speed cache.

I still think HBM might be the logical next step, but we shall see. With all the enterprise stuff going to CXL, I could also see a future where chips have massive caches and then just hop on a CXL link to the rack of RAM next to the compute rack.
 
In my proposed implementation, the cache at the IOD level doesn't have to be as fast as L3; that's made up for in size. It just has to be faster/lower-latency than system RAM, so multiple CCXs can more easily share bigger data sets without a trip to RAM. Note it doesn't replace L3, which would still be present closer to the cores.
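As a toy model of why "slower than L3 but faster than RAM" can still pay off, here's an average-memory-access-time sketch. Every latency and hit rate in it is an illustrative guess, not a measured figure:

    /* Toy AMAT model for a hypothetical L4 on the IOD. All numbers are
       made up for illustration; only the structure of the math matters. */
    #include <stdio.h>

    int main(void)
    {
        double l3_hit = 10.0, l4_hit = 35.0, dram = 80.0; /* ns, assumed */
        double l3_miss   = 0.20; /* fraction of accesses that miss L3 */
        double l4_hit_rt = 0.60; /* fraction of L3 misses the big L4 catches */

        double without_l4 = (1 - l3_miss) * l3_hit + l3_miss * dram;
        double with_l4    = (1 - l3_miss) * l3_hit
                          + l3_miss * (l4_hit_rt * l4_hit
                                       + (1 - l4_hit_rt) * dram);

        printf("avg access: %.1f ns without L4, %.1f ns with L4\n",
               without_l4, with_l4); /* 24.0 vs 18.6 with these guesses */
        return 0;
    }

Even with an L4 several times slower than L3, the average trip gets shorter as long as it catches a decent share of what would otherwise go all the way to RAM.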

HBM isn't a great option for consumer-level CPU cores, because it's a very wide bus at a low clock, so latency would be bad. It only makes sense for bigger-data cases, perhaps feeding a GPU or accelerator. It makes more sense in enterprise-level CPUs, since they often have a lot more hungry cores to feed.
 
If mackerel can manufacture his CPU -- I'll buy it! The Mackerel architecture! I guess it would have to be X3D cache. I'm not even sure SRAM is manufactured anymore, is it? I'm guessing SRAM chips would be too big for the substrate anyway.
 
I'm not even sure SRAM is manufactured anymore, is it? I'm guessing SRAM chips would be too big for the substrate anyway.
SRAM is 100% still used. That's what the cache is. Size has nothing to do with it. Keep in mind the quantities we're talking about here; 10s of MB, not GB.

Come a long way since the external SRAM SIMMs or DIMMs used back in the day.
 
SRAM is 100% still used. That's what the cache is. Size has nothing to do with it. Keep in mind the quantities we're talking about here; 10s of MB, not GB.

Come a long way since the external SRAM SIMMs or DIMMs used back in the day.
I think PC system RAM has always been DRAM, not SRAM. In the early days that I can remember, it was SDRAM, a type of DRAM, not a type of SRAM. SRAM might have been used when cache was installed as chips external to the CPU. DRAM is used where capacity matters, as it can be implemented with a single transistor per bit. SRAM is more for performance, as it doesn't have the limitations of DRAM, but it needs many transistors per bit. The cost of SRAM per unit of capacity could be an order of magnitude higher.
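To put numbers on the transistor-count point, here's a quick cell-count comparison for a 64 MB array (the capacity of the stacked V-cache die), assuming the textbook 6T SRAM cell and 1T1C DRAM cell:

    /* Rough cell-count comparison for 64 MB of cache. Ignores tag arrays,
       ECC, and peripheral circuitry, so treat it as an illustration only. */
    #include <stdio.h>

    int main(void)
    {
        double bits = 64.0 * 1024 * 1024 * 8; /* 64 MB in bits */
        printf("SRAM (6T per bit): ~%.1f billion transistors\n", bits * 6 / 1e9);
        printf("DRAM (1T per bit): ~%.1f billion transistors\n", bits * 1 / 1e9);
        return 0;
    }

Roughly 3.2 billion transistors versus about 0.5 billion for the same capacity, which is how you end up near that order-of-magnitude cost gap once process differences are factored in.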
 
I think PC system RAM has always been DRAM, not SRAM. In the early days that I can remember, it was SDRAM, a type of DRAM, not a type of SRAM. SRAM might have been used when cache was installed as chips external to the CPU. DRAM is used where capacity matters, as it can be implemented with a single transistor per bit. SRAM is more for performance, as it doesn't have the limitations of DRAM, but it needs many transistors per bit. The cost of SRAM per unit of capacity could be an order of magnitude higher.
Well, we're going way back here; pre-286 days. Think Sinclair or Commodore.
 
On early 8088 and 80286 systems you had the option of going with 512 KiB of RAM instead of 640 KiB. You could add DIPs to such systems up to 640 KiB, but this was before my time. I even saw one unusual 80286 system that had stacked DIPs; I believe that was because the 80286 address space extended up to 16 MiB, and that was the only way to add more than 640 KiB in DIPs to the system (although there was most certainly more to it than that).
 