This is mainly about the recently released AI products, and we may be some way off from information on the gaming parts. Still, it may offer a clue of what's happening.
Blackwell's die size seems to be pushing TSMC's reticle limit. Nvidia said it is 2x Hopper, which was already close to that limit, the 2x coming from the two chips glued together. Plugging this into a yield calculator, we'd expect around 50% yield (defect-free dies) out of roughly 60 candidate dies per wafer, depending on exact dimensions and placement.
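As a rough sketch of where those numbers come from: using a Poisson defect model and the standard dies-per-wafer approximation. The ~814 mm² die area and 0.09 defects/cm² density are my assumptions (chosen to sit near the reticle limit and a mature N5-class defect rate), not published figures:

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    # Common approximation: gross dies by area, minus edge losses
    return int(math.pi * (wafer_diameter_mm / 2) ** 2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

def poisson_yield(die_area_mm2, defect_density_per_cm2):
    # Poisson model: probability a given die has zero defects
    return math.exp(-defect_density_per_cm2 * die_area_mm2 / 100)

die_area = 814   # assumed mm^2, near TSMC's ~858 mm^2 reticle limit
d0 = 0.09        # assumed defects/cm^2 for a mature N5-class node

print(dies_per_wafer(die_area))               # 63 candidate dies per 300 mm wafer
print(round(poisson_yield(die_area, d0), 2))  # 0.48, i.e. roughly 50% defect-free
```

Both values land in the "60-ish dies, ~50% yield" ballpark; the exact figures shift with the assumed defect density and die placement.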
Wafer pricing is the next question. Stepping back, it is on the 4NP process, an evolution of Nvidia's custom 4N process at TSMC. Note that TSMC's general nodes are named Nx, not xN. Previously I used the first link below to estimate pricing; I just found the second link, which has different numbers. The 4N class is based on N5, so I'm assuming pricing will be similar, or possibly a bit more given it is custom. If we take the higher-end value of $17k per wafer, that puts the cost at just under $300 per die, whether good or bad. Again, around 50% are expected to be defect-free. There may be some further loss due to binning, but cut-down offerings will soak those up. GB200 seems to get the best dies, with B200 and B100 getting lower-quality ones. This doesn't account for packaging costs or HBM.
https://www.techpowerup.com/272267/alleged-prices-of-tsmc-silicon-wafers-appear#g272267-2
https://www.tomshardware.com/news/tsmc-expected-to-charge-25000usd-per-2nm-wafer
Why not some variation of N3? Cost? Capacity?
Nvidia claims 10 TB/s of bandwidth over the die-to-die connection. I think this is the highest claimed for any product we have visibility of in the computing space. Apple's M2 Ultra is the other example I can think of, with a claimed 2.5 TB/s, which is notable for a consumer-tier product. Intel's Sapphire Rapids could be interesting to compare, but I've been unable to dig up numbers for its internal bandwidth. RDNA3 GPUs split the MCDs from the GCD and claim a peak bandwidth of 5.3 TB/s, but that isn't connecting multiple execution dies together, so it will never scale as much.
The goal on the GPU side must be to have multiple chips working as one, enabling better performance scaling without the pain SLI/Crossfire had.