
What AMD need to put into the K8L for definite :)


fordsierra4x4

Member
Joined
Jan 30, 2005
Location
Hove, Sussex, UK
Just a quick, kinda techie rundown - I've been looking at the HKEPC data that compares the C***** and the FX-62, and it's obvious what AMD needs in the K8L to remain competitive.

First off, AMD currently has a 12-stage pipeline and C***** has 12 to 14 stages. We can't be sure, but it looks like AMD don't need to change that at any rate. Maybe another 2 stages would give it a bit of a clockspeed boost, and other architectural changes would make up for the increased length.

AMD has the upper hand in L1 cache size, although it needs to be waaay more associative - 2-way vs 8-way isn't much of a fair fight - even 4-way would be better!
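
To put a rough number on the associativity point, here is a minimal sketch of how the set count falls out of cache size, line size and number of ways. The 64KB L1 data cache matches the 64 + 64 split mentioned later in the thread; the 64-byte line size and the 32KB 8-way figure for the other chip are assumptions brought in for illustration, not taken from the HKEPC table.

```python
def cache_sets(size_bytes: int, line_bytes: int, ways: int) -> int:
    """Number of sets in a set-associative cache."""
    total_lines = size_bytes // line_bytes
    return total_lines // ways


# Assumed: 64-byte cache lines on both designs.
two_way_sets = cache_sets(64 * 1024, 64, 2)    # 64KB, 2-way
eight_way_sets = cache_sets(32 * 1024, 64, 8)  # 32KB, 8-way (assumed size)

print("2-way sets:", two_way_sets)
print("8-way sets:", eight_way_sets)
```

In the 2-way case only two lines that map to the same set can be resident at once, so a third hot address landing in that set evicts one of them. That is the weakness being complained about, even though the cache is bigger overall.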

C***** has an advantage as far as the L1 TLBs are concerned - the FX-62 has a combined total of 64 entries, Intel's offering has 384 :(

The C***** has a slightly wider branch predictor compared to the FX-62, although I suspect AMD will be altering that in any case - it's only short by 4 bytes/cycle :)
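
A quick sketch of what those entry counts mean in terms of "TLB reach", i.e. how much memory can be touched before an L1 TLB miss. It assumes standard 4KB pages, and since the totals above lump instruction and data TLBs together, treat the result as a ballpark only.

```python
def tlb_reach_kb(entries: int, page_bytes: int = 4096) -> int:
    """Memory covered by the TLB entries, in KB (assumes 4KB pages by default)."""
    return entries * page_bytes // 1024


fx62_reach = tlb_reach_kb(64)       # 64 combined L1 TLB entries
c_reach = tlb_reach_kb(384)         # 384 combined entries

print(fx62_reach, "KB vs", c_reach, "KB")  # 256 KB vs 1536 KB
```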

Now we come to the big stuff, the proper internals.

First, the Load/Store units need a little improvement - namely AMD need to add one.

Secondly, the FPU and the SSE units on the FX-62 need to be improved. The C***** has a more complex FPU (from the looks of things), but its SSE performance is absolutely gargantuan. Two 64-bit wide SSE units on the FX-62 just cannot compete with the three 128-bit wide SSE units on the C*****.

Aside from those, the K8 architecture doesn't need to change anywhere near as radically as Intel's has had to with the transition from Netburst to Core. The improvements that K8L will bring are still vague and unconfirmed, although if someone could take the data that I've used here:

http://translate.google.com/transla...&hl=en&ie=UTF-8&oe=UTF-8&prev=/language_tools

and add another column for the K8L specs (albeit speculative) - it could be interesting :beer:
 
Rattle said:
and until all that happens... I won't be using AMD again, not to mention the sucky prices on a 2x1MB cache chip

2x1MB gives very little performance benefit, though, and lower-cost, easier-to-produce AM2s would be more beneficial for AMD at the moment, as well as for the consumer. But whatever, you've already made up your mind to go C*****, no need to keep repeating it. :shrug:
 
fordsierra4x4 said:
2x1MB gives very little performance benefit, though, and lower-cost, easier-to-produce AM2s would be more beneficial for AMD at the moment, as well as for the consumer. But whatever, you've already made up your mind to go C*****, no need to keep repeating it. :shrug:


I don't see how less cache is beneficial for the consumer. I don't want to pay $200 for a cheap chip that's better for AMD's pockets when I can buy a much better one from someone else for the same price.

When you've played with as many 512KB and 1024KB cache chips as I have, and overclocked as many, singles and duals, you won't be saying half the cache isn't a big deal, because it's roughly equal to 200MHz. My 165 at stock performed identically to my X2 3800 at stock, the 165 being 200MHz slower and having 2x the cache. When you OC them, the cache comes into play even more.
 
fordsierra4x4.. thanks for the arch. differences rundown.

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2768&p=3

anandtech said:
At a lower level, we have a block diagram of the compute core for K8L CPUs. Again, this diagram is a bit oversimplified, but we can see a few key features of the architecture. On the FP side, the CPU is able to handle 2x128-bit floating point or SSE operations per clock. While this isn't quite as flexible as Intel's Core with its 3 SSE units, AMD's K8L will be able to handle 4 double precision floating point operations per clock. (Current K8 chips can only do 1x128/2x64-bit SSE instructions per clock.)

Google says FPU units will be doubled with K8L

http://www.realworldtech.com/page.cfm?ArticleID=RWT060206035626

The load/store units also have somewhat more flexible execution; they can re-order loads with respect to other loads (although loads cannot move around stores)


That last article seems to have some newer info, and a lot of what you're looking for seems to be pretty broken up across quite a few pages online.
 
greenmaji said:
fordsierra4x4.. thanks for the arch. differences rundown.

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2768&p=3



Google says FPU units will be doubled with K8L

http://www.realworldtech.com/page.cfm?ArticleID=RWT060206035626




That last article seems to have some newer info, and a lot of what you're looking for seems to be pretty broken up across quite a few pages online.

Hmm - cheers for that URL - looks like the K8L will have a 32-byte per cycle branch predictor - 12 up on the Conroe. As for the FPU & SSE units, well, widening the SSE units from 64-bit to 128-bit is certainly useful, and the 4 DP FLOPS... well, is that really all that useful for most people?

Would be handy to find some info on the L1 associativity - 2-way is awfully weak. Good to see the L1 cache remains at 128KB though :)
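
For anyone wondering where the 4 DP figure comes from, here is a back-of-envelope sketch. It only counts lanes (unit width divided by operand width) times the number of units, so it is a theoretical peak under that simple assumption and says nothing about what real code will sustain.

```python
def peak_ops_per_clock(units: int, unit_width_bits: int, operand_bits: int = 64) -> int:
    """Theoretical peak FP operations per clock: one op per operand-sized lane."""
    lanes_per_unit = unit_width_bits // operand_bits
    return units * lanes_per_unit


k8_dp = peak_ops_per_clock(2, 64)     # two 64-bit units, double precision
k8l_dp = peak_ops_per_clock(2, 128)   # two 128-bit units, double precision

print("K8 peak DP ops/clock:", k8_dp)    # 2
print("K8L peak DP ops/clock:", k8l_dp)  # 4
```

Passing operand_bits=32 gives the single-precision equivalent, which doubles both numbers.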
 
Rattle said:
I don't see how less cache is beneficial for the consumer. I don't want to pay $200 for a cheap chip that's better for AMD's pockets when I can buy a much better one from someone else for the same price.

When you've played with as many 512KB and 1024KB cache chips as I have, and overclocked as many, singles and duals, you won't be saying half the cache isn't a big deal, because it's roughly equal to 200MHz. My 165 at stock performed identically to my X2 3800 at stock, the 165 being 200MHz slower and having 2x the cache. When you OC them, the cache comes into play even more.

I don't know how many chips you've played with, but I've played with plenty - and the difference between my Athlon 64 3000+ @ 2250MHz and my Opteron 146 at the same 2250MHz was about 2% - hardly noticeable for most people. It only makes a difference if you want it to. Tangible, real-world differences aren't going to happen with an extra amount of L2 cache.

If benchmarking is all you do - then sure, 1MB vs 512KB L2 will make a difference, but for 90% of everyone else, it won't make any difference in terms of being able to play a game that was previously unplayable or being able to encode stuff to XviD any faster.

It's because of the Athlon64's IMC that L2 cache doesn't need to play as big a part as it does in Intel's case. As someone else pointed out - L1 is where it's at, and that's where the CPU primarily works from. AMD's 128KB (64 + 64) design is simply better than Intel's, although it could do with better associativity.

Ask yourself this question - which is better - a cheap 1MB (512KB per core) L2 duallie, or an expensive 2MB (1MB per core) one? The differences at the same clockspeeds are going to be minute. Which makes better financial sense? The 1MB, 512KB L2 per core CPUs. Less die space means more dies on a wafer, and more dies on a wafer leads quite logically to cheaper production costs - and THAT can make a difference.

Oh, and this isn't a space for arguing about 512KB vs 1MB L2 or grumbling about the current Athlon64 architecture - it's about the improvements that AMD are going to make with regards to the K8L.
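
To illustrate the wafer argument, here is a rough sketch using a common gross dies-per-wafer approximation (wafer area divided by die area, minus an edge-loss term). The wafer diameter and both die areas are made-up placeholder numbers, not real AMD figures, and yield is ignored entirely.

```python
import math


def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Approximate gross dies per wafer; ignores yield and scribe lines."""
    radius = wafer_diameter_mm / 2
    wafer_area = math.pi * radius ** 2
    # Edge-loss term accounts for partial dies around the rim of the wafer.
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


# Hypothetical values for illustration only.
WAFER_MM = 300
SMALL_DIE_MM2 = 150   # stand-in for the 2x512KB part
LARGE_DIE_MM2 = 200   # stand-in for the 2x1MB part

small = dies_per_wafer(WAFER_MM, SMALL_DIE_MM2)
large = dies_per_wafer(WAFER_MM, LARGE_DIE_MM2)
print(f"smaller die: {small} per wafer, larger die: {large} per wafer")
```

Since a wafer costs roughly the same to process either way, the cost per die scales with the inverse of those counts, which is the whole point of the smaller-cache part.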
 
We'll see when you can buy a 2.4GHz Conroe with 2x1MB cache for $200, which you'll wish you'd bought, when all you can get from AM2 is a 2.0GHz chip with 2x512KB for $200.
 
Rattle said:
We'll see when you can buy a 2.4GHz Conroe with 2x1MB cache for $200, which you'll wish you'd bought, when all you can get from AM2 is a 2.0GHz chip with 2x512KB for $200.

If you don't have anything productive to post in this thread, don't post at all. If you continue this way, I'll give you a couple of days to think about it.
 
Rattle said:
We'll see when you can buy a 2.4GHz Conroe with 2x1MB cache for $200, which you'll wish you'd bought, when all you can get from AM2 is a 2.0GHz chip with 2x512KB for $200.

What part of 'this is not a Conroe vs AMD thread' don't you get? I've already explained to you the differences between Athlon64 and Conroe with regards to L2 cache. The current P4 Ds come with 4MB L2 cache - does that make a huge difference vs the X2s? Nope. Your arguments are juvenile and not grounded in any kind of technical fact - your argument is simply that larger = better. Which is crap.

To reiterate - this is not a Conroe vs AMD thread - we're discussing the K8L featureset compared to the current K8 architecture. Now go away and troll another thread.
 
fordsierra4x4 said:
it's about the improvements that AMD are going to make with regards to the K8L.

It would also be nice to know how K8L is going to stack up on features and architecturally compared to C**** (just following your lead :p )
Well, you did K8 vs C****, so this is a given already :D

Believe it or not, it would give me a better idea of how K8L will perform :eek: ;)

I'm a fan of performance, see ;)
 
Like HKEPC have done - a table comparing exactly what comprises both the C***** and the K*L would be useful, with the current K8 as a baseline.

Problem is finding out specific details beyond what's already been published :(
 
No doubt, AMD has a whole lot more to work with than Intel did. They got themselves a great start with the A64s. The next year and a half should prove very interesting.

any more info available on the supposed "reverse hyperthreading"?
 
Rattle said:
I don't see how less cache is beneficial for the consumer. I don't want to pay $200 for a cheap chip that's better for AMD's pockets when I can buy a much better one from someone else for the same price.

That isn't completely fair. First off, a Venice and a San Diego clocked at the same speed differ very little in performance. Doubling the L2 cache on K8 doesn't seem to do a whole lot except for boosting the very small writes in memory benchmarks. Conroe is a whole lot better than what AMD currently has to offer, but AMD have begun to cut prices significantly.

hUMANbEATbOX said:
any more info available on the supposed "reverse hyperthreading"?

Exactly what I want to know.
 
hUMANbEATbOX said:
any more info available on the supposed "reverse hyperthreading"?

I've heard that they might implement that in K10... just some rumor. IMO it's too far away to start a discussion about it. But it's a very interesting tech nonetheless.

dan
 
If they don't implement reverse HT now, then in future it won't make any difference, as most programs/games will use two or more cores anyway.
 