After a bit more thought, I'd say that the crossbar does not operate at the hypertransport frequency. Why? There have been several articles where people have investigated the effect of changing the hypertransport multipliers. See
http://www.neoseeker.com/Articles/Hardware/Guides/athlon64oc/2.html
for example. None of these show any significant change in Sandra memory benchmarks, going from 1x to 5x. This is not what would be expected if the RAM bandwidth were being cut by a factor of 5. This doesn't rule out the possibility of the crossbar running at a constant multiple of the system clock, which would cause it to have a restricted bandwidth on each port.
There are two experiments that could shed light on this. The first would require a dual-Opteron system and a bit of research to see how XP NUMA support works. Essentially, it'd be a RAM bandwidth test that would run on a single CPU, but hit RAM from both CPUs. Without a restriction, the bandwidth readings should be around 10GBytes/sec (~5 from local RAM and ~5 from the remote RAM over hypertransport). If the CPU port was restricted to 6.4GBytes/sec, this would be obvious, since the readings would top out at that figure instead.
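To make the expected readings for that first experiment concrete, here's a rough sketch of the arithmetic. The function name is made up for illustration, and the simple "add the two sources, then clamp to the port limit" model is my assumption, not a description of how the hardware actually arbitrates:

```python
def expected_reading(local_gbs, remote_gbs, port_limit_gbs=None):
    """Predicted bandwidth (GBytes/sec) seen by one CPU reading from both RAM banks.

    Assumption: the two sources simply add up, and a restricted CPU port
    acts as a hard ceiling on the total.
    """
    combined = local_gbs + remote_gbs
    if port_limit_gbs is None:
        return combined
    return min(combined, port_limit_gbs)


# Figures from the post: ~5 GBytes/sec local, ~5 GBytes/sec remote,
# and a suspected 6.4 GBytes/sec limit on the CPU port.
unrestricted = expected_reading(5.0, 5.0)
restricted = expected_reading(5.0, 5.0, port_limit_gbs=6.4)

print(f"no port limit:  ~{unrestricted:.1f} GBytes/sec")
print(f"6.4 port limit: ~{restricted:.1f} GBytes/sec")
```

The gap between the two predictions is large enough that even a fairly noisy benchmark should be able to tell them apart.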
The second test could run on an X2 or a dual-Opteron. This would work by having two threads, one for CPU0 and one for CPU1. The first thread would generate, say, 512KB of data. Obviously, this would reside in the cache of CPU0. Once the first thread had generated the data, the second thread would start, and read in both the data from the first thread (hence requiring it to be transferred from CPU0, over the hypertransport bus on the dual-Opteron) and also data from main memory. Again, this would show up as a reduced overall bandwidth if the CPU or crossbar bandwidth is constrained.
A variant of this test could be interesting as well. It'd be essentially the same, except CPU1 just reads the data from CPU0, so it tests only the bandwidth between the two CPUs. If the crossbar operated at a frequency higher than the hypertransport link and the X2 operates as AMD claims (where the second CPU is directly bolted onto the crossbar, not going through a hypertransport link), the X2 would have more CPU<->CPU bandwidth than the dual-Opteron system.
Back to the topic, where does that leave us? Well, there are two options:
1) Memory controller <-> CPU bandwidth limited in some way.
2) Higher than 1:1 ratios are not implemented properly.
I personally find option 1 much more likely. Since all AMD has done is add an extra few columns to a divider table, there should be no difference in operation of the memory controller between 200 and 250MHz settings. All that is different is the divider. But if option 1 is the case, AMD has done some very strange things. For example, what would be the point of the 5x hypertransport multiplier, and the higher than 1:1 memory ratios? Clearly, something is bottlenecking below 6.4GBytes/sec, which would suggest a crossbar multiplier and bus width that would limit it to 256 bits per system clock cycle (e.g. 8x at 32 bits wide, 4x at 64 bits wide, etc).
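For anyone wanting to check the 256 bits figure, here's the arithmetic as a quick sketch. It assumes a 200MHz system clock and 1 GByte = 1000 MBytes; the function names are just for illustration:

```python
def bits_per_system_clock(bandwidth_mbytes, system_clock_mhz):
    """Bits moved per system clock cycle for a given bandwidth in MBytes/sec."""
    # MBytes/sec divided by MHz gives bytes per clock; times 8 gives bits.
    return bandwidth_mbytes * 8 / system_clock_mhz


def crossbar_bandwidth_mbytes(system_clock_mhz, multiplier, width_bits):
    """Bandwidth in MBytes/sec of a hypothetical crossbar port."""
    return system_clock_mhz * multiplier * width_bits / 8


SYSTEM_CLOCK_MHZ = 200  # assumed stock system clock

# 6.4 GBytes/sec expressed as bits per system clock cycle.
print(bits_per_system_clock(6400, SYSTEM_CLOCK_MHZ), "bits per clock")

# The multiplier/width combinations mentioned in the post give the same limit.
for multiplier, width_bits in [(8, 32), (4, 64)]:
    mbytes = crossbar_bandwidth_mbytes(SYSTEM_CLOCK_MHZ, multiplier, width_bits)
    print(f"{multiplier}x at {width_bits} bits wide: {mbytes / 1000} GBytes/sec")
```

Any combination where multiplier times width comes to 256 gives the same ceiling, so the bandwidth figure alone can't tell you which one is actually in use.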
Back to the orange text, this is referring to the early Opterons. These chips had a memory bus limit of 166MHz (PC2700), and reported this as such in the DramFreq field. All that the orange text is saying is that DramFreq is indicative, not restrictive. You can tell these early Opterons to use a 200MHz RAM frequency, even if they tell you that they can't go above 166MHz. They just probably won't work correctly.