New Phenom 2's X4 and X2

Niku-Sama · Mar 3, 2009

KTE said:
...
I had definitely planned to upgrade every A64/K8-based rig at my office department and at the apartment with Callisto... if it came at lower MHz. Office/home don't require such fast parts. High MHz means higher power/cost, roughly at the lower end of the Heka spectrum. I'd be immediately interested if they shipped some 2.5-2.7GHz Callisto models but that seems unlikely. I hope they keep the pricing structure similar to what they did with Agena based models, which would make Callisto a very clever buy.
....

That is true, but I think a new one of these isn't going to cost the same as a new Windsor; they'll probably be in the $50-$60 range. Look at the 65nm 7's now: they haven't been out that long but they're already around $60.

I'm jumping on one of these Phenom X2's when they come out. Whether or not I'll be able to use it is another story, as I am moving soon... without my computer.

CharlieCS · Mar 3, 2009

Considering the stock GHz, I think they will be in the 90 dollar range, plus or minus $10, but I will be wary until I see voltages. I have learned my lesson with a neutered quad.

Rick James · Mar 3, 2009

Kuroimaho said:
HKepc has info on 3 new PH2 processors with launch dates, they usually base these on information what the channel receives so they used to be reliable. Link

Can't wait to see the pricing, the X2s could be interesting with so much cache, how these compare to C2Ds.

There's a big difference between having a lot of L3 cache compared to L2. Why doesn't amd just give the X2's 8 megs of L2 cache? The architecture of the C2D is light years above the X2's but with all that L2 cache it would be a lot closer.

ChanceCoats123 · Mar 3, 2009

I think this was mentioned in a previous post, but since L1 is so fast, it is expensive. L2 is not as fast but is still more expensive than L3, which despite being the slowest can be included in larger amounts.

terran2k · Mar 3, 2009

Rick James said:
There's a big difference between having a lot of L3 cache compared to L2. Why doesn't amd just give the X2's 8 megs of L2 cache? The architecture of the C2D is light years above the X2's but with all that L2 cache it would be a lot closer.

I always hear it the other way around: the AMD architecture has more potential, it's just never been fully reached, although they are getting their act together now. I don't think AMD has the capability to put in that much L2 cache without affecting their yields; that's my best guess. Maybe that'll change when they get the foundry up and running?

CharlieCS · Mar 3, 2009

Wasn't the L2 cache on C2D there to compensate for the FSB bottleneck? And the HT architecture wasn't benefiting from L2 cache; rather, it needed L3 cache for better multi-core operation and better multi-core communication... correct me if I'm wrong, I'm not too big on the subject.

DragoXT · Mar 3, 2009

The speed of the cache is negligible depending on how much associativity there is. With Intel their associativity is crap compared to AMD and they have tons of L2 cache, while AMD has much less cache but it is much more associative. Think of associativity as interconnects on a network. You have your compy hooked up to the LAN by connecting to a switch. Now in doing that you force the switch to know where to put the packets sent to get to the right recipient. Now what if you had a fully interconnected network where you had cables running to each and every computer? This causes little overhead and you can get data where you want it very quickly. AMD's cache setup is like the fully interconnected network, more associativity, while Intel's is less associativity.

Each processor architecture is different, so they do things differently yet achieve similar results. Intel has massive amounts of L2 cache because it needs it for its architecture to work best, while AMD chips didn't need that much to be competitive due to how their architecture is set up. AMD isn't forcing massive L2 cache because it doesn't need to. Heck, look at the Core i7: it only has 256k of L2 cache per core, less than AMD's chips. Each architecture has ways to make it work best, and the engineers spend their time figuring it out, and if upping the L2 cache doesn't net enough performance benefit, then why waste the money on it when L3 does just as good a job for less money? With Intel's Core architecture they found that more L2 really helped and made them shine, so they added it because it gave such a good performance boost.
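For anyone who wants to see what associativity actually changes, here is a minimal sketch of a set-associative cache with LRU replacement. In cache terms, associativity is the number of places ("ways") within a set where a given memory line may be stored, and more ways mean fewer conflict misses when several hot addresses map to the same set. The function name `count_misses`, the 8-line cache size, and the access pattern are all assumptions picked for illustration; they do not model any real Intel or AMD part.

```python
from collections import OrderedDict

def count_misses(addresses, num_lines, ways, line_size=64):
    """Simulate an LRU set-associative cache and return the number of misses."""
    num_sets = num_lines // ways
    # One ordered dict per set; key order tracks least- to most-recently used tags.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        block = addr // line_size      # which memory line the address falls in
        index = block % num_sets       # which set that line maps to
        tag = block // num_sets        # identifies the line within its set
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)     # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return misses

# Toy cache: 8 lines of 64 bytes (hypothetical sizes).
LINE = 64
NUM_LINES = 8
stride = NUM_LINES * LINE        # two addresses this far apart collide when direct-mapped
trace = [0, stride] * 10         # alternate between the two addresses

direct_mapped = count_misses(trace, NUM_LINES, ways=1)
two_way = count_misses(trace, NUM_LINES, ways=2)
print(direct_mapped, two_way)
```

With the same total capacity, the direct-mapped version keeps evicting one address to make room for the other, while the 2-way version holds both lines in the same set after the first two accesses.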

Niku-Sama · Mar 3, 2009

Rick James said:
There's a big difference between having a lot of L3 cache compared to L2. Why doesn't amd just give the X2's 8 megs of L2 cache? The architecture of the C2D is light years above the X2's but with all that L2 cache it would be a lot closer.

I wouldn't call it light years, but it is ahead. I think the main reason is there hasn't been a super awesome chipset, but they're getting better.

Zurvan · Mar 4, 2009

Rick James said:
The architecture of the C2D is light years above the X2's but with all that L2 cache it would be a lot closer.

I disagree. Architecture wise, K10 is superior (as in 'more advanced') compared to Penryn.
However, the performance of the actual product is also influenced by a lot of factors. K10 has had its share of flaws. K10.5 is a step in the right direction. With further revisions down the year, I think K10.5 should be able to beat Penryn in the actual product lineup. Whether or not that is commercially suitable is another story though.
L3 replacing L2 as the 'common CPU cache' is a step forward. Why do you want to get K8 back again ?

Kuroimaho · Mar 4, 2009

Rick James said:
Why doesn't amd just give the X2's 8 megs of L2 cache? The architecture of the C2D is light years above the X2's but with all that L2 cache it would be a lot closer.

I think he is just trolling but every once in a while comes a post like this which makes me giggle even months later when I remember it. Last time someone said AMD makes like 10 type of processors Opteron and Phenoms and... that post entertained me as much as this.

I have a feeling you have no idea what this thread is about. By the way, even the K8 was architecturally superior to the C2D, but this is about K10.5, which you seem to be unaware of judging by your L2-related question.

Why didn't Intel give the i7 8MB of L2 cache? Is the i7 also inferior to the C2D because of the smaller L2?

Rick James · Mar 4, 2009

Kuroimaho said:
I think he is just trolling but every once in a while comes a post like this which makes me giggle even months later when I remember it. Last time someone said AMD makes like 10 type of processors Opteron and Phenoms and... that post entertained me as much as this.

I have a feeling you have no idea what this thread is about. By the way, even the K8 was architecturally superior to the C2D, but this is about K10.5, which you seem to be unaware of judging by your L2-related question.

Why didn't Intel give the i7 8MB of L2 cache? Is the i7 also inferior to the C2D because of the smaller L2?

No i'm pretty much lost in the tech specs which is why i come into these threads.

AlabamaCajun · Mar 4, 2009

You have to understand what cache is for what. (Rick cover your ~~ears~~ eyes)

L1 is code and data in separate segments for the near core access.
L2 is code and data the CPU is about to use or has just been using but may need again. (This gets real technical from here but I'll spare it).
L3 is now added as the buffer to ram and paging between cores. Buffering is just a way to move blocks fast while other operations continue. It's there when needed. As for the paging/sharing part, that is where cores use data and pass it around. In some case code is loaded there and remains when something is used often or on several cores.
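A rough sketch of that lookup order, nearest level first. This is a toy model resting on several assumptions: the `lookup` function and the level names are made up for illustration, nothing is ever evicted, and a hit simply copies the line into the nearer levels, which is simpler than what real CPUs do.

```python
def lookup(address, levels):
    """Return the name of the first level holding the address, or "RAM".

    levels is a list of (name, set_of_addresses) pairs ordered from the
    level nearest the core to the farthest one.
    """
    for i, (name, contents) in enumerate(levels):
        if address in contents:
            # Hit: copy the line into every nearer level for next time.
            for _, nearer in levels[:i]:
                nearer.add(address)
            return name
    # Missed everywhere: fetch from RAM and fill all levels.
    for _, contents in levels:
        contents.add(address)
    return "RAM"

# Hypothetical starting state: one line already sitting in L3.
hierarchy = [("L1", set()), ("L2", set()), ("L3", {0x40})]

first = lookup(0x40, hierarchy)    # found in L3, pulled closer to the core
second = lookup(0x40, hierarchy)   # now served from L1
cold = lookup(0x80, hierarchy)     # nowhere in cache, comes from RAM
print(first, second, cold)
```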

How much cache is used depends on how a formula was devised to determine usage.

Continued 3 hours later!
As said a few posts back, L1 is expensive and power intense. I think they figured 64K+64K covered most needs as intel was running 16 and 32 for a long time. This just depends on aspect of the programs. What is needed here and now amounts to 16K-64K which is inside the low order address space. (This one is for you Archer).

L2 is where the most swapping occurs. First shot to the L1 and close by. In most cases this is the staging area to get code and data close to the cores.
I think 1meg per core would improve some things but the average code block or data chunk is in the 16K, 64K and 256K range. A lot of this occurs because so much was built on the 16bit and 32bit systems that these rules still apply. Some large data access programs may have an advantage with more L2. Some games would be included here but 1 meg still would only be a drop in the bucket anyway. It comes down to trade offs.

L3 for servers and business apps: these get more out of the sharing aspect with multi-threaded apps.
L3 for gamers is mostly a buffer situation but greatly improves some games by keeping some code and data close at hand.

Both companies see this clearly and this is why cache sizes are what they are. Costs & real estate vs app needs drive what gets on die. Shrinking the die gets more space and costs less, so this is why we see the larger caches. Power for the core is less, leaving power available for the larger caches. Shanghai/Deneb are supposed to be able to shut down sections of cache to save power but I don't know if that spec made it to production.

I see 2M L3 doing a lot for most of what we run here. 4-8M does help many games and other apps.

Just looking at the Tri and Dual cores, the full 6M of L3 makes a lot of sense. We used to have 64k to 4M on many of our first "PCs". Some programs may still run inside these specs but use more data. It's just a good size cache, like having on-die RAM.
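To put rough numbers on that, here is a quick back-of-the-envelope calculation. The 6M of L3 comes from the post above; the 512K of L2 per core is an assumption taken from the published Phenom II specs, and dividing the L3 evenly is only a rough way to look at it, since the L3 is shared and not partitioned per core.

```python
L3_KB = 6 * 1024        # shared L3, "the full 6M" mentioned above
L2_PER_CORE_KB = 512    # assumed per-core L2 size

def cache_per_core_kb(cores):
    """L2 plus an even share of the L3, in KB per active core."""
    return L2_PER_CORE_KB + L3_KB / cores

for name, cores in [("quad", 4), ("tri", 3), ("dual", 2)]:
    print(f"{name}: {cache_per_core_kb(cores):.0f} KB per core")
```

On these assumptions, the fewer cores are active, the more L3 each one has available to it.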

Archer0915 · Mar 4, 2009

AlabamaCajun said:
You have to understand what cache is for what. (Rick cover your ~~ears~~ eyes) .
L1 is code and data in separate segments for the near core access.
L2 is code and data the CPU is about to use or has just been using but may need again. (This gets real technical from here but I'll spare it).
L3 is now added as the buffer to ram and paging between cores. Buffering is just a way to move blocks fast while other operations continue. It's there when needed. As for the paging/sharing part, that is where cores use data and pass it around. In some case code is loaded there and remains when something is used often or on several cores.

How much cache is used depends on how a formula was devised to determine usage.

More later, I'll have to pick this up in a bit.

Bama, I am disappointed. I think you should put it all out, and what does not get absorbed can then be spoon fed. So if you are going to cover this I want it all there, from fetch order to read/write through to addressing, or I won't be happy

lol

AlabamaCajun · Mar 4, 2009

Archer0915 said:
Bama, I am disappointed. I think you should put it all out, and what does not get absorbed can then be spoon fed. So if you are going to cover this I want it all there, from fetch order to read/write through to addressing, or I won't be happy lol

Took care of you bud, check update in post 32 :thup:

Archer0915 · Mar 4, 2009

Good overview. I have issues dealing with (giving) explanations and I am glad you did it. I think I get a little too technical where it is not necessary or wanted for that matter. That being said, there are some great books and technical bulletins as well as a lot of white papers out there if anyone is really interested (Rick).

Zurvan · Mar 4, 2009

Rick James, I suggest you please read this. It's just a single page and will really help you understand why small L2 plus large L3, and not large L2.

Archer0915 said:
I think I get a little too technical where it is not necessary or wanted for that matter.

Never refrain, you have plenty of audience :beer:

AlabamaCajun · Mar 4, 2009

No Rick, don't do it, it's propaganda j/k

ROFL but please warn us when the link is from ~~Intel~~ A-Tech

Found one from the horse's mouth: http://forums.amd.com/devblog/blogpost.cfm?catid=271&threadid=103010

Zurvan · Mar 5, 2009

AlabamaCajun said:
ROFL but please warn us when the link is from ~~Intel~~ A-Tech

Virus ?

TY for the link btw

squads · Mar 5, 2009

Zurvan said:
Virus ?

TY for the link btw

Anandtech is infamous for their Intel bias. That is what he was referring to.

KTE · Mar 5, 2009

Niku-Sama said:
That is true, but I think a new one of these isn't going to cost the same as a new Windsor; they'll probably be in the $50-$60 range. Look at the 65nm 7's now: they haven't been out that long but they're already around $60.

There are two main factors swaying my assumption on price.

#1 Deneb core is far more competitive than Agena core was which pushed AMD Agena into very low-end bargain bin right upon introduction... naturally that meant the lowest-end Agena's being harvested as Kuma had to fit in the overall marketing hierarchy and compete with their rivals, which limited them to just above K8 65nm prices. Remember, Kuma came very late, nearly a year after Agena did and by then, Intel had amassed a huge lead with Wolfdale and its harvested derivatives. Any higher pricing and Kuma wouldn't compete nor sell at such time. The price was strictly governed by the market competition.
Deneb and all the recouped CPUs using its cores don't face such a forced shove into sub-$80 pricing... well, nowhere near as extreme anyway, which allows AMD to sell them for a higher price with higher volume due to being competitive higher up (key to marketing).

#2 Heka is currently spanning the $120-150 territory. That means the next pricing territory lower down will be occupied by Callisto, that being $110 and below currently, but equal to or above the Kuma 7750BE pricing. Callisto however won't arrive for a month or two yet at the earliest, and by then, pricing is bound to drop another round at least. Bearing all this in mind, Callisto will likely come in at $75-$110. Regor, I expect, will come below this in price as it will be far cheaper to manufacture and make profits from. Callisto, with a huge L3 cache per core at its disposal, will show quite promising performance versus its competition but lose out on power requirement.

AlabamaCajun said:
Power for the core is less, leaving power available for the larger caches. Shanghai/Deneb are supposed to be able to shut down sections of cache to save power but I don't know if that spec made it to production.

One of the largest benefits of the L3 cache is precisely this ability: to detect core activity levels and flush the core and L1/L2 contents into the L3 so the rest of the unused logic can be shut down. It allows extremely low idle power, and that's what you'll find if you tap into each core phase for a Deneb in low-power mode (especially in C1E).

Another is also to provide the ability to a core to probe the private shared cache of any core and as the AMD developer explained, for local semi-inclusive and exclusive data buffering as per requirement.

My main push for AMD/Intel is to decrease cache latencies first and foremost, as that can make a very large performance difference per time. The lower storage hierarchies are all starved and their benefits and usage minimized due to the high latencies of the upper 'buffers'. I expect that's when AMD will start to see large benefits from the huge L3 size they've employed... as soon as the L3 latency is dropped by another 1/4, which is the sweet spot.
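The effect of a latency cut like that can be sketched with the textbook average memory access time (AMAT) calculation, where each level's cost is its hit latency plus its miss rate times the cost of going one level further out. Every number below is an illustrative assumption, not a measured Deneb figure, and the `amat` helper is hypothetical.

```python
def amat(levels, memory_latency):
    """Average memory access time in cycles.

    levels is a list of (hit_latency_cycles, local_miss_rate) pairs ordered
    from L1 outward; memory_latency is the cost of going all the way to RAM.
    """
    total = memory_latency
    # Work inward from RAM: each level adds its own latency and passes
    # only its misses on to the next level out.
    for hit_latency, miss_rate in reversed(levels):
        total = hit_latency + miss_rate * total
    return total

MEMORY_CYCLES = 200  # assumed RAM latency

# Assumed (latency, miss rate) for L1, L2, L3.
baseline = amat([(3, 0.05), (15, 0.40), (48, 0.50)], MEMORY_CYCLES)

# Same hierarchy with the L3 latency cut by a quarter (48 -> 36 cycles).
faster_l3 = amat([(3, 0.05), (15, 0.40), (36, 0.50)], MEMORY_CYCLES)

print(f"baseline:  {baseline:.2f} cycles")
print(f"faster L3: {faster_l3:.2f} cycles")
```

How much a faster L3 helps depends heavily on how often accesses get that far, so the gain is larger for workloads that miss in L1 and L2 more often than assumed here.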

They also need to improve buffering local to caches and the available prefetch abilities to each cache, as well as the width between cache and the transfer bandwidth between the MCT/DCT and the XBar/L3. Most of the subroutines optimized for higher L1 associativity will also suffer performance penalties with the Deneb core.
