Umm, cV, that's what a GPU is (used) for
A few comments on the article itself:
{} Unmatched-speed CPUs: This can already be done on a dual mobile Athlon system, with the small problem that Windows stuffs up if you try to do it, and Linux stuffs up unless you disable RDTSC timekeeping. This is obviously just a simple software issue.
{} Putting a Barton and an A64 on the same CPU doesn't help much. The only way for the two to work together (access the same RAM) would be to re-engineer the Barton to use HyperTransport for memory access, at which point you've more or less got a dual-core Hammer. Having a dual-core where you can shut down one core (which can *almost* be done with dual Athlon systems via halt disconnect) or step down the frequency/voltage (can already be done on dual Athlon systems) is a much more practical/efficient way to go. On the Intel side, though, a Dothan/Prescott combination would be good (as long as Intel didn't throw out the MP capability when they were moving from the P3 design), since the Dothan probably won't be as good at encoding as the Prescott.
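To illustrate the RDTSC point from the first comment: here's a toy model (not real kernel code) of why timekeeping breaks when two CPUs tick at different rates but the OS converts ticks to time with a single assumed frequency. The clock speeds and function names are made up for the example.

```python
# Toy model: two CPUs whose timestamp counters tick at different rates.
# FAST_HZ / SLOW_HZ are hypothetical clock speeds, not measured values.

def tsc_ticks(freq_hz, seconds):
    """Ticks a counter running at freq_hz accumulates over `seconds` of wall time."""
    return int(freq_hz * seconds)

def ticks_to_seconds(ticks, assumed_freq_hz):
    """Elapsed time an OS would report if it assumed one fixed TSC frequency."""
    return ticks / assumed_freq_hz

FAST_HZ = 2_000_000_000  # hypothetical 2.0 GHz CPU
SLOW_HZ = 1_500_000_000  # hypothetical 1.5 GHz CPU
wall = 10.0              # real elapsed seconds

# The OS calibrated against the fast CPU and uses that figure for both.
fast_time = ticks_to_seconds(tsc_ticks(FAST_HZ, wall), FAST_HZ)
slow_time = ticks_to_seconds(tsc_ticks(SLOW_HZ, wall), FAST_HZ)

print(fast_time, slow_time)
```

A thread that gets scheduled on one CPU and then the other would see time jump around, which is why falling back to a different time source sidesteps the problem.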
I'd say that CPUs are going to drift back more towards being general purpose chips (i.e. more P3, less P4), and GPUs will become much more general DSPs (which the most recent ones are, to a limited extent). In terms of pure FPU grunt, a P4 or Athlon gets absolutely crushed by the latest GPUs. We're talking an order of magnitude here (X800 XT does ~200 GFLOPS peak IIRC, compared to a 3.6GHz P4's paltry 15 GFLOPS), but unfortunately it just can't be harnessed in today's cards (for both hardware and political reasons).
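Rough arithmetic behind those numbers, as a sketch: the P4 figure works out if you assume 4 single-precision flops per clock via SSE (that per-clock figure is my assumption, not something stated above), and the GPU figure is just the ~200 quoted from memory.

```python
def peak_gflops(clock_ghz, flops_per_cycle):
    """Theoretical peak: clock rate times floating-point ops issued per cycle."""
    return clock_ghz * flops_per_cycle

# Assumption: 4 single-precision flops per cycle (packed SSE) on the P4.
p4_peak = peak_gflops(3.6, 4)   # about 14.4, i.e. the "paltry 15"

gpu_peak = 200.0                # the ~200 GFLOPS quoted above, taken as-is

ratio = gpu_peak / p4_peak
print(f"P4 peak: {p4_peak:.1f} GFLOPS, GPU/CPU ratio: {ratio:.1f}x")
```

So the gap comes out at roughly 14x on paper. These are peak figures only; sustained throughput on real code will be lower on both sides.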