Hello. I am stupid. Cache question =)

Axle (Member; joined Aug 27, 2002; location: IE, CA)
So what does that bologna cache do anyway? I searched and I searched, and--believe it or not--couldn't find an answer!

I ask because I'm thinking about jumping from my 1700+ DUT3C AIRGA (actually I don't remember the exact stepping, but it used to be hot stuff! Ran it at 2.2 for a while.) to a Barton with 512KBs of the stuff. Now, I personally don't notice my Dad's Duron to run any slower than my XP, but I do notice my brother's Barton 2500 to run right snappy. Could just be he has really expensive stuff, though.

My question: in regular Mandrake/XP operation, am I going to notice any sort of speedyness by going Barton 2500? And in games (namely UT2K4), will those be any faster?

If I do make a jump, it'll be on a mobile proc; looking for that low heat & quiet alternative. I really can't OC anymore; I had (have) really nice passive water, but need some consolidation--fitting inside a closed case would be swell.

Anyway, sorry for the verbiage--

tia!,
lex
 
The farther from the processor data sits, the slower it is to access. What the L1 and L2 caches do is reduce the number of times the processor has to go out to main memory to fetch the next instruction or data word.

Caches use the properties of spatial locality--for instance, if one instruction at say 0x00AA0 is executed, it would be a pretty good bet that the instruction at say 0x00AA4 will be next in the pipes, no? The cache fetches data around the current instruction pointer and keeps it ready to stuff into the CPU.
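To put a rough number on the spatial locality idea, here is a minimal sketch that counts hits for sequential instruction fetches. The 64-byte line size and the unlimited-capacity cache are assumptions made for illustration, not details of any particular Athlon; only the 4-byte step (0x00AA0 to 0x00AA4) comes from the example above.

```python
LINE_SIZE = 64   # bytes per cache line (assumed for illustration)
INSTR_SIZE = 4   # bytes per instruction, matching the 0x00AA0 -> 0x00AA4 step


def count_hits(addresses, line_size=LINE_SIZE):
    """Count (hits, misses) for a cache that loads a whole line on each miss."""
    cached_lines = set()  # unlimited capacity, to isolate the locality effect
    hits = misses = 0
    for addr in addresses:
        line = addr // line_size  # which line this address falls in
        if line in cached_lines:
            hits += 1
        else:
            misses += 1
            cached_lines.add(line)
    return hits, misses


# 64 sequential instructions starting at 0x00AA0
sequential = [0x00AA0 + i * INSTR_SIZE for i in range(64)]
hits, misses = count_hits(sequential)
print(hits, misses)  # 59 5
```

Only 5 of the 64 fetches miss, because each miss drags in a whole line and the following instructions are already sitting in it.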

Your games might be faster with the 2500, if you tweak it beyond stock.

Anything else?
 
Ah! Thank you sir. So I guess I'd be better off saving my money?
 
totally agree with cpt newbie.

He explained the theory; as for practice, I tested them both, the Barton core and the Tbred B. The performance hit for the Tbred B is small, 1-2% in normal use, even in file compression/decompression and audio encoding/decoding.

Since the AMD Athlon XP has a much shorter pipeline than the Intel P4, extra cache doesn't help it as much as it helps the P4.

Yes, save the money for 939; then your bro's setup will have a hard time catching up ;)
 
Captain Newbie is correct.

AFAIK in practice, a 256 KB L2 cache has about a 95/5 hit/miss ratio, and a 512 KB one about 97/3. I don't remember where I read this information, so please correct me if I'm wrong (and I probably am).

The barton core is likely to have other improvements over the thoroughbred anyway, such as architecture. My 2800 @ 2.4GHz goes mighty snappily :). I definitely see the improvement over a 1.9GHz T-Bred.
 
Ah, thank you all!

Though I can't say I'm totally clear on it (how can you say a 2500 will run no faster, but still run faster!! Gah!! :D), I don't have too much money to spend on a new gizmo anymore.

939, eh? That's a lot of pins. Anyway, thanks all.
 
Uh axle, you actually got a tbred A to do 2.2g? Or is it 2200+? I'd like to know what mobo you have. I have a xp1700 tbred A AIRGA and I'd like to know if I can do it too. Right now I'm running at 2000+ (1.67g) which is max for my mobo. Thx.
 
Cache and CPU performance

Take two processors, A and B, both running at 2.5 GHz, i.e. 2,500,000,000 clock cycles per second. A basic CPU operation requires one clock cycle.

Processor A has a larger L2 cache, say 512 KB. Processor B has a smaller L2 cache, say 256 KB.

The L1 and L2 caches store frequently used data for the CPU temporarily, until new data has to be swapped in from main memory and old data swapped out to it. The processor can read from and write to the cache in very few clock cycles (the cache latency).

Main memory (aka L3 in a PC) can store far more data (e.g. 1 GB of main memory is about 2000 times the size of a 512 KB L2). Reading or writing main memory takes far more CPU cycles, say 30 - 80 times as many.

The hard drive (aka L4 in a PC) can store even more data, basically the universe of the data in your system, but it takes even more time. It is accessed during paging, when data is not found in main memory.

L1 cache, L2 cache, main memory (L3) and hard disk (L4) form the so-called memory hierarchy.

The larger the cache, the higher the chance (probability) of finding data there. Analysis shows that once the cache is above a certain size for a given CPU architecture, CPI and cache latency, the probability levels off. Typically, the probability is around 85 - 95% for an L2 ranging from 256 KB to 512 KB or even 1 MB.
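As a toy way to look at hit probability versus cache size, here is a minimal sketch of a fully associative LRU cache run against a synthetic access trace. Everything in it is an assumption for illustration: the trace is random with a made-up skew toward "hot" lines, and real L2 caches are set-associative rather than fully associative, so the numbers it prints say nothing about actual Athlon hit rates.

```python
import random
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Fraction of accesses in `trace` that hit a fully associative LRU cache."""
    cache = OrderedDict()
    hits = 0
    for line in trace:
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark as most recently used
        else:
            cache[line] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / len(trace)


def make_trace(n_accesses=20000, n_lines=4096, seed=0):
    """Synthetic trace in which low-numbered lines are touched far more often."""
    rng = random.Random(seed)
    # paretovariate returns a heavy-tailed float >= 1; clamp it into the line range
    return [min(int(rng.paretovariate(1.0)) - 1, n_lines - 1)
            for _ in range(n_accesses)]


trace = make_trace()
for capacity in (64, 128, 256, 512, 1024):
    print(capacity, round(lru_hit_rate(trace, capacity), 3))
```

With LRU on a fixed trace the hit rate can only stay the same or rise as the capacity grows; how quickly it flattens depends entirely on how skewed the trace is.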

Reading or writing main memory typically takes many more CPU cycles (see the earlier number). So if the CPU needs data that is not in the cache (a cache miss), it has to wait until the data arrives in the cache from main memory, many cycles later than if it had been found in the cache.

Even though CPU A and CPU B both run at 2.5 GHz, CPU A will finish a given job sooner than CPU B, since the probability of CPU A finding data in the cache is higher. CPU A has fewer cache misses than CPU B.
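The usual back-of-the-envelope formula for this is: average access cost = hit cost + miss rate x miss penalty. A minimal sketch follows; the 10-cycle hit cost, the 200-cycle miss penalty and the 95% / 90% hit rates are all illustrative assumptions (the hit rates are simply picked from the 85 - 95% range), not measurements of any real chip.

```python
def avg_access_cycles(hit_rate, hit_cycles, miss_penalty_cycles):
    """Average cycles per L2 access: every access pays the hit cost,
    and the fraction that misses pays the main-memory penalty on top."""
    return hit_cycles + (1.0 - hit_rate) * miss_penalty_cycles


HIT_CYCLES = 10      # assumed L2 latency in cycles
MISS_PENALTY = 200   # assumed extra cycles to go out to main memory

cpu_a = avg_access_cycles(0.95, HIT_CYCLES, MISS_PENALTY)  # larger L2 (assumed 95% hits)
cpu_b = avg_access_cycles(0.90, HIT_CYCLES, MISS_PENALTY)  # smaller L2 (assumed 90% hits)
print(round(cpu_a, 1), round(cpu_b, 1))  # 20.0 30.0
```

Keep in mind this is the cost per L2 access, not per instruction. Most instructions are satisfied by L1 or need no memory access at all, which is why the gap in whole-program run time is much smaller than the gap shown here.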

Analysis has shown that doubling the L2 cache size improves overall performance by 0 - 10%+ over a wide range of applications, some more and some less, with a typical average of about 5%.

That is why we usually say a Barton (512 KB L2) performs about 5% better than a 1700+ (256 KB L2) running at the same frequency; put another way, the 1700+ has to run about 125 MHz faster (5% of 2.5 GHz) to break even with a Barton at 2.5 GHz. A few months ago, a Tbred B DLT3C 1700+/1800+ overclocked about 100 MHz higher than a desktop Barton, so they were about tied. But recently the mobile Barton overclocks equally well, and often even higher than the 1700+/1800+, so the mobile Barton is the better choice for performance (apart from the price difference).
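The 125 MHz break-even figure is just 5% of 2.5 GHz. A minimal sketch of that arithmetic, assuming performance scales linearly with clock speed (a simplification; the function name is made up for illustration):

```python
def break_even_mhz(base_mhz, advantage_percent):
    """Extra clock (in MHz) the slower-per-clock chip needs to match the other,
    assuming performance scales linearly with frequency."""
    return base_mhz * advantage_percent / 100


print(break_even_mhz(2500, 5))  # 125.0
```

So if the small-cache chip overclocks at least that much higher than the Barton, the two come out roughly even.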


What happens to programs running in CPU with smaller and bigger L2 cache (page 17)

Some remarks on cache latency, cache size, memory latency and memory bandwidth (for A64's) (page 19)
 
Hey, I understand quite a bit better now, hitech, thanks. 5-10% isn't worth $5-10 to me, let alone $50-100. You broke down that flummoxing issue right nicely, I must say! :)

wju: no, sadly, it took some digging and my chip is a JIUHB 1700+, on a NF7-S v1.2 & 2x256 Buffalo BH-5 3200 (bought these sticks at $39/per back before anyone had ever thought of CH-5!). I did own an AIRGA, but it was a 1.5 volter DLT. Speaking of which, I could really go for a BLT.

Think about this: with a 1700+ DUT3C JIUHB on chilled water (idle at 29c, load= 33c or so), with an Abit NF7-S and some really pretty desirable DDR3200 BH-5 ram, the very best I ever did was this:

3k.jpg


I'll say it: I just suck at overclocking!
 
If I understand your setup correctly, with a rev 1.2 nForce2 board and a Tbred A, your result is quite good.

Most people could not get to 200 MHz FSB on a rev 1 motherboard, and a Tbred A maxes out at 1.8-1.9 GHz on air. You are getting almost 2.2 GHz with chilled water.

Had it been a mobile Barton, you would most likely be getting 2.7-2.8 GHz.

A rev 1 nForce2 motherboard may need a Vdd mod to get a higher chipset voltage and reach a higher FSB (around 210 MHz). IMO that's not worth doing now, given the risk/reward.
 
The barton core is likely to have other improvements over the thoroughbred anyway, such as architecture.

You'd be surprised by how little difference there is besides the enlarged cache tacked on. Probably just slightly different placement of some core parts to make it fit into a rectangle better.
 