TMU explanation
iceman2g
07-15-03, 11:43 PM
Can someone explain the difference between 8x1 and 4x2? I'm asking because my 8500 has 4 pipes with 2 TMUs per pipe, and I see the 9700 Pro has 8 pipes with 1 TMU per pipe. Wouldn't that mean they render the same number of pixels if both cards were clocked the same? Somebody help me clear this up, as I'm not that great with video cards.
Thanks in advance
BUMP - I'd like an answer to this as well.
Albuquerque
07-17-03, 05:59 PM
Well, some clarification first: your 8500 has three pipelines, not four. But it does have two texture stages per pipeline, for a total effective output of 3x2. The GF4 series has a pair of pipelines at two texture stages each, giving 2x2. The R300 core, as you already mentioned, is built on an 8x1 setup.
Basically, it means the following:
In theory, your card has more single-texture fillrate than a GF4 (more pipelines), and more multitexture fillrate as well.
However, the R300 core will win by a landslide on single-texture fillrate simply because it has more than double the pixel pipelines (eight versus three). In multitexture, your card can only (at a theoretical maximum) put out six textures per clock, whereas the R300 could theoretically put out eight textures per clock. The problem is, the R300 cores do multitexture by way of an "internal loopback", meaning the output of one pipeline is looped back around and reworked for multiple textures. So it actually turns out that you don't get a true eight textures per clock.
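The per-clock numbers above are just multiplication, and a rough sketch may make them easier to compare. This assumes the usual reading of "PxT" notation (P pixel pipelines, T texture units per pipeline) and gives paper peaks only; it does not model the loopback behaviour. The function name and the table of layouts are made up for illustration.

```python
def peak_per_clock(pipelines, tmus_per_pipeline):
    """Return (pixels_per_clock, texels_per_clock) for a 'PxT' layout.

    These are theoretical peaks only; real throughput depends on
    memory bandwidth, loopback cost, and the workload.
    """
    return pipelines, pipelines * tmus_per_pipeline


# Layouts mentioned in this thread (illustrative table, not vendor data).
layouts = {"2x2": (2, 2), "3x2": (3, 2), "4x2": (4, 2), "8x1": (8, 1)}
peaks = {name: peak_per_clock(*cfg) for name, cfg in layouts.items()}

for name, (pixels, texels) in peaks.items():
    print(f"{name}: {pixels} pixels/clock, {texels} texels/clock")
```

Note that on paper a 4x2 layout and an 8x1 layout reach the same texels per clock, but the 8x1 layout has twice the pixels per clock.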
When it all comes down to it, clock-for-clock the 3x2 method is a bit faster for multitexture applications. However, the 8x1 method is a lot faster for single-texture / raw fillrate applications. ALSO KEEP IN MIND, pixel shaders are done on a per-pixel single pass basis, which means an 8x1 setup has a LOT more pixel output power when doing pixel shaders.
This is likely at least half the reason why the R300 and R350 cores are whipping the FX series of cards in pixel shader performance -- the FX's are built on basically a 2x2 or 4x2 setup, versus a straight 8x1 setup. This essentially cripples the FX's pixel shader processing power by a factor of two (when compared clock-for-clock) against the Radeon 9500P cards and above.
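To put the "factor of two, clock-for-clock" claim in numbers, here is a minimal sketch. It assumes peak pixel output is simply pipelines times core clock, and the 300 MHz figure is a made-up clock applied to both layouts, not the real speed of any card.

```python
def pixel_fillrate_mpixels(pipelines, core_clock_mhz):
    """Theoretical peak pixel output in megapixels per second."""
    return pipelines * core_clock_mhz


# Hypothetical clock, identical for both layouts (clock-for-clock comparison).
ASSUMED_CLOCK_MHZ = 300

four_pipe = pixel_fillrate_mpixels(4, ASSUMED_CLOCK_MHZ)
eight_pipe = pixel_fillrate_mpixels(8, ASSUMED_CLOCK_MHZ)
ratio = eight_pipe / four_pipe

print(f"4 pipelines: {four_pipe} Mpixels/s")
print(f"8 pipelines: {eight_pipe} Mpixels/s")
print(f"8-pipe advantage: {ratio}x")
```

The ratio does not depend on the clock chosen, since the clock cancels out when both layouts run at the same speed.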
Originally posted by Albuquerque
Well, some clarification first....
Nice explanation! Thanks
Dragonprince
07-17-03, 07:06 PM
Originally posted by Albuquerque
Well, some clarification first....
Excellent post.:)
iceman2g
07-17-03, 10:52 PM
Great explanation, though according to this article (http://www.anandtech.com/video/showdoc.html?i=1544) the 8500 has 4 pipelines. I checked it out on my card and it says 4.