The Problems With SOI . . .

Yesterday, we said we would talk about the prospects of AMD getting competitive K10 chips out.

We’ve suspected based on circumstantial evidence that AMD’s core problem has been the 65nm SOI process.

Now someone has come up with some hard evidence that this in fact has been the case.

You really should look at the link; it’s excellent detective work. In short, the gentleman went to the AMD datasheets and compared the current drawn by various 65nm and 90nm AMD chips in the Halt/Stop Grant states (i.e., the idle states for CPUs).

What he found was that:

  • the power consumed by various AMD chips was all over the place, with the lower-end chips chewing up much more power in idle than the faster ones
  • the degree of variation was even worse with 65nm chips than with 90s; the high-end 65s are better than the high-end 90s, but the low-end 65s are worse than the low-end 90s.

    As he put it:

    “It looks to me like AMD splits 65nm parts into three buckets: let’s call them low, medium, and high leakage.

    1. The low leakage parts are used for the 4800+ and 5000+ products. They are lower leakage than AMD’s best 90nm parts. This is good news. They draw about 8% less current at 1.1V, and about 23-28% less current at 1.35V.

    2. The medium leakage parts are used for the 4000+ and 4400+ products. They are roughly as high in leakage as AMD’s leakiest parts on 90nm, and definitely leakier than their 90nm median parts. The 4000+ part, for example, draws more current at 1.1V than AMD’s downbinned 3800+ part on 90nm. At 1.325V, they are drawing more current than AMD’s high end 90nm parts at 1.35V. This is certainly not good news, and suggests that the median of AMD’s 65nm process leakage is worse off than at 90nm.

    3. The high leakage parts are downbinned to the 3600+ chip. Although this part has been removed from the current lineup, it’s not clear whether they are still producing these and selling them as 4000+ parts, or whether their process has improved. At any rate, these parts are insanely leaky. At 1.1V, they are drawing almost 50% more current than AMD’s worst 90nm part. And good thing AMD restricted the voltage to 1.3V, because even at this voltage, the leakage towers over the entire 90nm product line.

    I think these results are pretty interesting, and may explain why AMD has not been able to ramp 65nm. The leakage is killing them, and only their lowest leaking parts are able to hit 2.6GHz at 1.35V, and still maintain a reasonable power envelope.”

    There was a good followup post which points out that leakage seems to explode once you go past a certain point. That author didn’t document it, but if you go to pages 28 and 31 of the AMD datasheet being used for all this, you’ll find that the 3.0GHz 90nm part chews up almost double the power of the 2.6GHz 90nm part, and much more than double the power in the Halt/Stop Grant state.

    That’s not surprising given what IBM said about SOI a few years back (unfortunately the IBM links no longer work, but the gist of what they said is in the article). SOI doesn’t take well to high voltage.

    Yes, the datasheet is dated February. Yes, AMD is probably doing a bit better now, at least in the product mix.

    But AMD has struggled to make SOI chips fast for as long as they’ve been using SOI.

    What’s going to happen? Well, AMD has been spending a lot more on R&D lately than they did even a year ago, about 70% ($180 million) more. Yes, that figure now includes ATI, but ATI was spending about $90 million on R&D per quarter before the merger.

    So AMD is now spending at least $90 million more on R&D than it used to. Significantly, that figure went up $40 million from the previous quarter, when one might think AMD would be trying to cut costs.
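For anyone who wants to check the arithmetic, here is a rough sketch in Python. It uses only the figures quoted above and assumes that the 70% / $180 million increase compares one quarter against the same quarter a year earlier, and that ATI's roughly $90 million per quarter carried over unchanged after the merger. The function name `rd_breakdown` and its parameters are made up for illustration.

```python
def rd_breakdown(increase_musd, increase_pct, ati_quarterly_musd):
    """Back out implied R&D totals from a dollar and a percentage increase.

    All amounts are in millions of dollars. Assumes the increase is
    quarter-over-year-ago-quarter and ATI's spending stayed flat.
    """
    # If $180M is a 70% increase, the year-ago base is 180 / 0.70.
    prior = increase_musd / (increase_pct / 100.0)
    current = prior + increase_musd
    # Whatever is left after subtracting ATI's share is AMD's own increase.
    non_ati_increase = increase_musd - ati_quarterly_musd
    return prior, current, non_ati_increase


prior, current, non_ati_increase = rd_breakdown(180.0, 70.0, 90.0)
print(f"Implied year-ago quarterly R&D: about ${prior:.0f} million")
print(f"Implied current quarterly R&D:  about ${current:.0f} million")
print(f"Increase not explained by ATI:  about ${non_ati_increase:.0f} million")
```

Under those assumptions, the year-ago base works out to roughly $257 million a quarter and the current figure to roughly $437 million, with about $90 million of the increase not accounted for by ATI. The inputs are rounded, so treat the outputs as ballpark figures.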

    I could be wrong, but I suspect this increased R&D “spending” may really mean “writing bigger checks to IBM for SOI help.” Not that this is a bad thing.

    Whether or not I’m right about IBM, it’s the SOI process that’s going to make or break AMD over the next year. All else is secondary.

    P.S. As an extra bonus, the gentleman who did the power analysis also included this gem:

    Also, I found something interesting in one of the other AMD datasheets I was looking at. Ever wonder why AMD’s Brisbane chips do so much better in idle power dissipation tests done by reviews…? It’s because they enabled a new mobile sleep state on it.

    http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/32559.pdf

    Check out Table 64 on page 278. Previously, desktop chips supported no better than C1 Halt state. Starting with G-step (Brisbane), they now support C3.

    Interesting that AMD has needed to start enabling mobile sleep states on their desktop parts. Intel’s mobile parts support all the way down to C4E, while their desktop parts only support C1E.

    Excellent work indeed.

    Ed

