Reverse Hyperthreading? . . .

Some days ago, AMD mumbled something to somebody about coming up with something called “reverse hyperthreading,” which would make multiple CPUs look like one to a single-threaded program.

This was supposed to be a feature that would show up in the K10 series of chips (that means 2008 or later).

Well, thanks to an astute Australian reader (Paul Harry, by name), I can tell you that somebody else has been putting a lot of work into this: NEC.

You really should take a look at the webpage, but what it describes is the only way you could do such a thing: it automatically parallelizes single-threaded programs, a sort of distant cousin to what Transmeta chips do.
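To make the idea concrete, here's a minimal sketch of the kind of transformation an automatic parallelizer performs (this is my own illustration, not NEC's actual technique; the names `f` and `chunk` are invented for the example). When a loop's iterations are independent, the work can be split across cores and stitched back together with the result unchanged:

```python
# Illustrative sketch only (not NEC's method): an "embarrassingly parallel"
# loop whose iterations are independent, so a parallelizer could split
# them across cores without changing the result.
from concurrent.futures import ThreadPoolExecutor

def f(x):
    return x * x  # each iteration depends only on its own input

data = list(range(1000))

# Serial version: what the single-threaded program actually says.
serial = [f(x) for x in data]

# Parallelized version: the same loop, chunked across workers the way
# a parallelizing compiler or runtime might transform it.
def chunk(seq, n):
    k = (len(seq) + n - 1) // n
    return [seq[i:i + k] for i in range(0, len(seq), k)]

with ThreadPoolExecutor(max_workers=4) as pool:
    parts = list(pool.map(lambda c: [f(x) for x in c], chunk(data, 4)))
parallel = [y for part in parts for y in part]

assert parallel == serial  # identical results, order preserved
```

The hard part, of course, is what the sketch skips: proving the iterations really are independent, which is exactly what such technology would have to do automatically.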

I don’t know if AMD is going to license NEC’s technology, or do it on their own, but in the end, that doesn’t really matter. It’s the presence of the technology that counts.

This is one of those, “on the one hand, on the other hand” technologies.

On the one hand, like Intel’s hyperthreading, this is unlikely to provide a big general boost. If a computing task is not amenable to parallelization, a parallelizer isn’t going to help at all and might actually hurt a bit.
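Here's a quick illustration (my own example, not anything from NEC) of why some code simply can't be split up. Each pass through this loop reads the value the previous pass wrote, so the iterations can't be farmed out to separate cores; a parallelizer has to leave the loop serial, and any coordination machinery it bolts on is pure overhead:

```python
# Hedged illustration: a loop-carried dependence that defeats naive
# parallelization. Each iteration needs the accumulator produced by
# the previous iteration, so the iterations cannot run side by side.
def serial_chain(data):
    acc = 0
    out = []
    for x in data:
        acc = acc * 2 + x  # depends on acc from one iteration ago
        out.append(acc)
    return out

print(serial_chain([1, 2, 3]))  # [1, 4, 11] -- each value built on the last
```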

Even when a task is amenable to parallelization, any automatic parallelization will be less efficient than optimal manual programming, claims by NEC notwithstanding.

On the other hand, even if an automated program could do just 50% of what an optimally programmed parallelized application can do, optimally programmed parallel programming is terribly difficult to do, takes a lot of time, and is very expensive. It’s like writing machine code as opposed to using a high-level language/compiler. The first is better, but the second is much more affordable and practical.

Of course, if, as NEC claims, its technology is much better than typical manual programming, that would be a whole lot better. The point here is not to doubt NEC, but to indicate that even if the technology is relatively lousy at what it does, it still would be very, very appealing to developers.

Perhaps more importantly, the CPU companies are hellbent on giving us two, and later, four-core processors. It is pretty difficult to really justify a dual-core processor for a Joe Sixpack machine, never mind four cores a few years down the road. Technology like this would at the least make it a lot easier to sell such devices to the average person.

I suspect NEC’s initial reason for developing such technology is to sell it, not to stick in some PC, but to XBox and PS3 developers who have to deal with multiprocessor programming now. It certainly could be used by PC developers, too; indeed, it’s hard to imagine how PC software developers wouldn’t jump on it.

Putting this into a homebody’s CPU core may be quite a different matter. If the developers jump on this (which they will), there’s no need for it. If this is introduced in an AMD CPU before the developers jump on it, well, it’s one thing to do a little debugging of an application in a development environment, it’s quite another to do it at home.

It makes more sense for a developer to run the optimization once and then release the software than to make each end user run the optimization and rewrite the code, or to make each end user run the optimization each and every time the code is executed.

I suppose any parallelization in a home CPU could have safeguards built in to do only foolproof optimizations, but that would reduce the performance improvement.

Let’s leave aside any possible AMD development several years from now for the moment. What is important, indeed very important, is that NEC’s work is the equivalent of developing a compiler for parallel processing, and, provided it works reasonably well, will at worst become as critical a tool for the computing industry in the next few years as a C++ compiler is today.

Maybe that’s what AMD is really doing, developing a parallel compiler (and making hardware adjustments to accommodate it) just like Intel provides compilers (which aren’t necessarily too nice to non-Intel CPUs).
