The P4 has one complex 32-bit ALU, which runs at the core clock, and two skewed, double-pumped 16-bit ALUs for simple ALU operations. With the move to double-pumped 32-bit ALUs, simple ALU operations on full 32-bit integers would no longer take a full clock, but only half a clock. This would improve integer performance (for simple operations) significantly. I'm not sure whether the complex ALU is going to be double-pumped as well, though. Another improvement that wasn't mentioned is the double-pumped L1 data cache, and possibly the trace cache too (double-pumped meaning it runs at twice the core clock speed).
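To put rough numbers on the ALU point, here's a back-of-the-envelope sketch in Python. It's a deliberately simplified model, not a description of the real pipeline: it assumes each ALU-width slice of the operand takes one tick of the double-pumped clock and that the slices run back to back. The function name and its parameters are made up for illustration.

```python
def simple_op_latency_cycles(operand_bits, alu_width_bits, pump_factor=2):
    """Latency in core clock cycles for a simple ALU operation.

    Simplified assumption: each alu_width_bits slice of the operand takes
    one tick of the fast (pumped) clock, and slices are processed in sequence.
    """
    slices = -(-operand_bits // alu_width_bits)  # ceiling division
    return slices / pump_factor


# Current P4: a 32-bit operation goes through a 16-bit double-pumped ALU as two slices.
current = simple_op_latency_cycles(operand_bits=32, alu_width_bits=16)

# Rumored design: a 32-bit double-pumped ALU handles the whole operand in one slice.
rumored = simple_op_latency_cycles(operand_bits=32, alu_width_bits=32)

print(current, rumored)
```

Under those assumptions the 32-bit operation goes from one core clock to half a core clock, which is the improvement described above.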
And of course, there's the larger L2 cache, the larger L1 cache and possibly trace cache, an improved version of Hyper-Threading, and the faster FSB, and let's not forget the improvements in clock speed.
I don't know where this 3.2 GHz figure came from, though. As far as I've heard, Intel has not made any statements about what frequency Prescott will be introduced at. However, it seems reasonable to say that the 0.13-micron Northwood has a lot more life in it than just 3.2 GHz. It also seems implausible that Intel would wait an entire quarter before releasing a faster speed grade.