This is not entirely true; it really comes down to a matter of perspective. A 3-module, 6-core BD can execute six 128-bit floating point operations at once, just like a 6-core Thuban. If this were not the case, Trinity would not be anywhere near parity with Llano core-for-core in flops, even in synthetics:
http://www.tomshardware.com/reviews/a10-5800k-a8-5600k-trinity-apu,3241.html
To explain, each module has a 256-bit FPU shared between two cores. However, if no 256-bit operation is requested, that 256-bit FPU can be split into two 128-bit FPUs, one dedicated to each core. So you have up to 8 cores with 4 FPUs, but the possibility of 8 FPU operations in parallel (actually more, as each 128-bit FPU can execute more than one flop at a time, but that's another discussion).
AMD's perspective is that you have 4 modules with two 128-bit FPUs each, and the two FPUs in a module can be used in conjunction for a single 256-bit call, which gives four 256-bit operations across the chip. So AMD sees it as 8 cores, each with its own 128-bit FPU as on Thuban, with the added option of working like four 256-bit cores. It's a matter of perspective.
http://blogs.amd.com/work/2010/10/25/the-new-flex-fp/
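To put numbers on the two perspectives, here is a rough back-of-the-envelope sketch in Python. The function name `parallel_fp_ops` is made up for illustration, and the model is deliberately simplified: it only counts how many operations of a given width fit into the shared 256-bit FPUs, and ignores the fact that each 128-bit unit can execute more than one flop at a time.

```python
FPU_WIDTH_BITS = 256  # one shared FPU per module, as described above


def parallel_fp_ops(modules: int, op_width_bits: int) -> int:
    """Count FP operations of one width that can run side by side.

    Simplified model (an assumption, not AMD's spec): a module's 256-bit
    FPU either runs one 256-bit operation or splits into two 128-bit halves.
    """
    if op_width_bits not in (128, 256):
        raise ValueError("this sketch only models 128-bit and 256-bit operations")
    return modules * (FPU_WIDTH_BITS // op_width_bits)


print(parallel_fp_ops(3, 128))  # 3-module, 6-core part: 6
print(parallel_fp_ops(4, 128))  # 4-module, 8-core part: 8
print(parallel_fp_ops(4, 256))  # same part running 256-bit operations: 4
```

So the same 4-module chip looks like 8 FPUs or 4 FPUs depending on which operation width you count.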
Where it gets complicated is that because of the shared resources (not just the FPU but pretty much the entire front-end), you don't get the full performance from each core. BD has a wide pipe but can't pump enough water to fill it. AMD's goal was something like 85% performance per core with shared resources, and they did not hit this with BD. They are working on improving shared-resource performance and, I'm sure, also working with MS and Linux distros to make better use of the new architecture.
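As a rough illustration of what that target would mean, the arithmetic can be sketched like this. The 0.85 figure is just the approximate goal mentioned above, the function name `core_equivalents` is hypothetical, and real scaling depends heavily on the workload.

```python
def core_equivalents(modules: int, per_core_efficiency: float) -> float:
    """Estimate throughput in units of one unshared core.

    Assumes two cores per module, each delivering `per_core_efficiency`
    of a standalone core when both are busy (a simplification).
    """
    if not 0.0 < per_core_efficiency <= 1.0:
        raise ValueError("per_core_efficiency must be in (0, 1]")
    cores = modules * 2
    return cores * per_core_efficiency


# 4 modules at the ~85% goal versus a hypothetical lower figure
print(core_equivalents(4, 0.85))
print(core_equivalents(4, 0.70))  # 0.70 is an arbitrary example, not a measurement
```

At the 85% goal, an 8-core part would behave roughly like 6.8 unshared cores under full load; anything lower per core eats into that quickly.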
The advantage of this design is that it saves a lot of die space relative to the performance you trade off, assuming you execute well. This allows for a much higher core count without too much sacrifice in lightly threaded scenarios. The other advantage is that this direction could eventually lead to a real "fusion" of CPU and iGPU, where the GPU takes over a lot of your floating point operations and is much faster than a CPU trying to execute them. Whether this comes about, and more importantly whether AMD can execute on all of it, remains to be seen, but it is at least a bold step by AMD to try to predict the future of computing and beat Intel to the punch.