You can't base your conclusion about an architecture on the results of one benchmark. I could do the same with WinRAR and say that an Opteron kicks Xeon butt, needing only 2 GHz to match the performance of Intel's flagship quad core at 2.93 GHz.
It's not that it's poorly implemented; that's just the reality of L3 cache. It's a trade-off. Yes, you add latency on the accesses that end up going to main memory, but most of the time you should be able to find much of the data in L3, so you skip the trip to main memory and performance goes up. In some situations it will help significantly; in others it may hurt a little. Obviously AMD felt that its testing showed the L3 cache tends to help more than it hurts.
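
To put rough numbers on that trade-off, here's a back-of-the-envelope sketch in Python. The latencies (20 ns for an L3 lookup, 100 ns for main memory) are made up for illustration, not measured on any Opteron or Xeon, and the function names are mine. It also assumes the simplest possible model: an access that misses L3 pays the full L3 lookup first and then the normal trip to memory.

```python
def latency_with_l3(hit_rate, l3_ns, mem_ns):
    """Average latency of an access that reaches L3 (hypothetical model).

    Every access pays the L3 lookup; only the misses also pay for main memory.
    """
    return l3_ns + (1.0 - hit_rate) * mem_ns


def break_even_hit_rate(l3_ns, mem_ns):
    """L3 hit rate at which having the L3 costs the same as going straight to memory."""
    # Solve l3_ns + (1 - h) * mem_ns == mem_ns for h.
    return l3_ns / mem_ns


if __name__ == "__main__":
    L3_NS = 20.0    # assumed L3 lookup latency, illustrative only
    MEM_NS = 100.0  # assumed main-memory latency, illustrative only

    print("no L3:", MEM_NS, "ns")
    for hit_rate in (0.0, 0.1, 0.5, 0.9):
        avg = latency_with_l3(hit_rate, L3_NS, MEM_NS)
        print("L3 hit rate", hit_rate, "->", avg, "ns")
    print("break-even hit rate:", break_even_hit_rate(L3_NS, MEM_NS))
```

Under these assumed numbers, the L3 only loses when its hit rate drops below the ratio of L3 latency to memory latency. That's the "hurts a little" case: a workload that streams through far more data than the cache can hold. Anything with decent reuse comes out ahead.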