Overclocking Sandbox: Tbred B DLT3C 1700+ and Beyond

hitechjb1 · Apr 21, 2003

Sandra 2003 CPU arithmetic benchmark

The D-MIPS (CPU integer benchmark) IPC = MIPS / clock_freq (clock in MHz)

XP 1700+ DLT3C (256KB L2) = 9411 / 2518 = 3.74 instructions/clock

For P4-B 3.06G (512KB L2) = 8957 / 3060 = 2.93 instructions/clock

For P4-B 2.8G (512KB L2) = 8196 / 2800 = 2.93 instructions/clock

Barton 3000+ (512KB L2) = 8130 / 2160 = 3.76 instructions/clock

XP 2600+ (256KB L2) = 7829 / 2080 = 3.76 instructions/clock

In other words, the AMD XP does more instructions per clock than the P4, at a ratio of around 1.28:1. On the other hand, the top-end P4 (more expensive) runs at a much higher absolute clock speed, and the race goes on. To make a fair comparison, we have to look at the ratio between the clocks.
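The per-clock figures above can be reproduced with a few lines of Python. This is only a minimal sketch: the scores and clocks are the ones quoted in this post, while the function and variable names are made up for illustration.

```python
# Sandra 2003 Dhrystone scores (MIPS) and clocks (MHz) as quoted in this post.
RESULTS = {
    "XP 1700+ DLT3C @ 2518": (9411, 2518),
    "P4-B 3.06G": (8957, 3060),
    "P4-B 2.8G": (8196, 2800),
    "Barton 3000+": (8130, 2160),
    "XP 2600+": (7829, 2080),
}


def ipc(mips, clock_mhz):
    """Instructions per clock = MIPS / clock in MHz."""
    return mips / clock_mhz


if __name__ == "__main__":
    for name, (mips, mhz) in RESULTS.items():
        print(f"{name:24s} {ipc(mips, mhz):.2f} instructions/clock")
    xp = ipc(*RESULTS["XP 2600+"])
    p4 = ipc(*RESULTS["P4-B 3.06G"])
    print(f"XP : P4 ratio = {xp / p4:.2f} : 1")
```

Depending on which XP result and how much rounding you use, the ratio lands a little either side of 1.28, which is why it is quoted as "around" 1.28:1.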

A CPU architecture usually lasts a few years, and over that time the IPC ratio stays pretty constant (short term, say 12 months). So whichever side pushes its clock rate, through circuits and silicon technology, far enough to move the clock ratio past the IPC ratio will come out ahead.

The IPC ratio between XP and P4 is pretty consistent at around 1.28:1. So at any point in time, looking only at D-MIPS (integer calculation), we can look at:

1. The top clock rates of P4 and XP. Divide IntelClock/AMDClock; if the result is larger than 1.28, Intel is ahead on the D-MIPS benchmark. In the past, I think the Intel to AMD clock ratio was higher than 1.28 for the top-end CPUs.

For example, there are some XP CPUs running at 3 GHz. A P4 would have to run at 3 GHz x 1.28 = 3.84 GHz to break even !!!

2. This can extend to the absolute frequency achieved by overclocking.

3. Or evaluate the ratio at a given price and at a given time for price performance evaluation.

E.g. right now, for a $60-70 CPU, an AMD 1700+ can deliver 2.2 - 2.4 GHz. What can an Intel P4 deliver at $60-70? Do the ratio calculation.
Is there any P4 around $70 that can deliver 2.4 * 1.28 = 3.07 GHz ?

One can do a similar calculation for a $100 CPU, a $150 CPU, ...
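The break-even rule in points 1 to 3 can be written down as a small helper. This is a sketch only, assuming the 1.28:1 integer IPC ratio from this post; the function names are hypothetical.

```python
IPC_RATIO_XP_TO_P4 = 1.28  # integer IPC ratio quoted in this post


def p4_break_even_ghz(xp_ghz, ipc_ratio=IPC_RATIO_XP_TO_P4):
    """P4 clock needed to match an XP running at xp_ghz on D-MIPS."""
    return xp_ghz * ipc_ratio


def intel_ahead(p4_ghz, xp_ghz, ipc_ratio=IPC_RATIO_XP_TO_P4):
    """True when the Intel/AMD clock ratio exceeds the IPC ratio."""
    return p4_ghz / xp_ghz > ipc_ratio


if __name__ == "__main__":
    # The $60-70 example: a 1700+ at 2.4 GHz.
    print(f"P4 needs {p4_break_even_ghz(2.4):.2f} GHz to break even")
```

For a price tier, plug in the clocks that each side can reach at that price and compare with `intel_ahead`.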

We know we shouldn't build a system based on the CPU alone; this is just one metric to evaluate and benchmark CPUs.

sanford1 · Apr 21, 2003

Celemine1Gig said:
Congratulations! Great OC!

And the pics are now working without problems!

Yes, nice OC! I have to deal with the ambient temps here in FL and my "old" PC-65 without the blowhole on top, like the new ones, needs some work. I plan on putting a 92mm on top and making a nice clear acrylic duct on the side window to the CPU fan. Then I'll try to push it some more. I'm still playing.

hitechjb1 · Apr 22, 2003

Tbred B 1700+ DLT3C at 2558 MHz 1.925V air

WCPUID

Sandra 2003 CPU arithmetic benchmark

Asus Probe

hitechjb1 · Apr 22, 2003

What are the differences between Palomino, Tbred A and Tbred B/Barton?

hitechjb1 (04-30-2003 04:33 PM) said:
If you can see the chip die physically, as someone already pointed out, the Palomino is more square in shape and the Tbred A and B are more rectangular.

The Palomino has a bigger die, 128 mm^2, compared to 80 and 84 mm^2 for Tbred A and B.

If you can see the transistors and count them, the Palomino has 37.5 million, whereas the Tbred A and B have 37.2 and 37.6 million respectively.

If your eye is powerful enough to go inside the chip and see the transistors, the Palomino transistors are built on a 0.18 micron process, whereas the Tbred transistors are built on 0.13 micron.

If you are able to count the metal layers, the Palomino has 7 layers, whereas Tbred A has 8 layers and Tbred B has 9 layers.

OK, back to business. Since we probably can't see these with our naked eyes, ...

But if the chip is already covered by the HSF and you cannot see the die shape, the stepping, ... here is the trick to find out:

Palomino has only one default Vcore, 1.75 V. Tbred A and Tbred B have several default Vcores, 1.5 V (the famous 1700+ DLT3C), 1.6 V and 1.65 V, depending on the PR rating of the Tbred.

You can get into the bios, set the Vcore to default.

If it says 1.75 V then it is a palomino.

If it is 1.5 or 1.6 or 1.65 V, it is a Tbred A or B.
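As a rough sketch of that check in code (the function name and the rounding to two decimals are my own assumptions, not part of any BIOS or tool):

```python
def identify_core(default_vcore):
    """Guess the Athlon XP core family from the BIOS default Vcore in volts."""
    vcore = round(default_vcore, 2)
    if vcore == 1.75:
        return "Palomino"
    if vcore in (1.5, 1.6, 1.65):
        return "Tbred A or B"
    return "unknown"


if __name__ == "__main__":
    for v in (1.75, 1.5, 1.6, 1.65):
        print(v, "->", identify_core(v))
```

Note that this only separates Palomino from Tbred; it cannot tell Tbred A from Tbred B.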

A more interesting question is why the Tbred B, in particular the Tbred B DLT3C, is rated at a lower Vcore yet can run faster than other Tbred B at the same voltage. The main reason, I think, is a lower transistor threshold and shorter channel length. Even though it is manufactured on 0.13 micron like other Tbred B, it effectively behaves like a chip with less than 0.13 micron, if you like.

link: Why the 1700+ can run so fast at low Vcore?

OnDborder · Apr 22, 2003

What chipset cooler is that? Where did you get it?

hitechjb1 · Apr 22, 2003

OnDborder said:
What chipset cooler is that? Where did you get it?

It comes in a no-name package called "chipset cooler", which consists of two passive square heat sinks (one of them is the green one that I put on top of the SB) and a small 40 mm fan. I got it from CompUSA for $10.

The NB heat sink is the stock HS that came with the A7N8X-dlx motherboard. I kept the original stock HS without reseating it and just attached a smaller 40 mm fan on top of it with screws and nuts (a little thinking will figure out how to do it).

OnDborder · Apr 23, 2003

Thanks..

hitechjb1 · Apr 24, 2003

How to get it to 2.6 GHz stable ?

Tbred B 1700+ DLT3C at 2584 MHz 1.95V air

It is now at 2584 MHz on air. It can post at 2596 MHz. Not stable.

Any suggestions to get it stable above 2.6 GHz?

Current HSF: SK-7, TT SKII
PSU: Antec TP 430
Temp: 40/40 C idle, 50/47 C Prime95 loaded

WCPUID

Asus Probe

FLyingHamster · Apr 26, 2003

man, I could only get mine to 2.44 stable.

any suggestions as to how I could hit 2.5+? my mobo and stuff is slightly different.

check my sig for further questions..

USAPGAPRO · Apr 26, 2003

Congrats on that OC. That is insane. Cant wait for mine.

hitechjb1 · Apr 26, 2003

An update for
Relationship of clock, die temperature and Vcore (update) (page 13)

Relationship of clock, die temperature and Vcore (update)

As far as the Vcore, clock and die temperature relationship goes, a chip (CPU) can be modeled as a capacitor C and a resistor R in parallel, driven by Vcore. C models the useful active power that sustains the computation by charging and discharging the hundreds of millions of internal capacitors (from coupling between transistors, wires and silicon substrate). R models the wasted leakage power through the internal current paths of the tens of millions of transistors.

If the die temp is kept low enough, in theory, today's XP and P4 can be clocked as high as 3 GHz or 4 GHz. The power (the C component) going into the chip to run the clock at a frequency f and Vcore V is given by

P_active = C V^2 f

And this can go on to 3-4 GHz if the die is kept below a certain temp. Most of the power is used to drive the clock faster as Vcore is increased.

But in reality, for any cooling used, air, water, vapor, liquid nitrogen, ..., the die temperature will eventually increase as Vcore increases, due to leakage current which heats up the chip, though at a different rate depending on what cooling is used. The leakage current is small at low temp, increases as temp increases, and does so at a faster rate the higher the temp. The power that heats up the chip (the R component) is given by

P_leak = V^2 / R

From my experiment with the TB B 1700+ DLT3C, when the die temp reaches around 40 C, the chip leakage current begins to increase at a faster pace and heats up the chip more, on top of the higher active power component P_active. Once this starts, any Vcore increase will heat up the chip at a faster pace. The exact Vcore at which this occurs varies from chip to chip (100-200 mV difference); it depends on certain properties and characteristics (the "genes") of how a particular CPU was born in silicon.

P = P_active + P_leak = CV^2 f + V^2 / R
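To play with this model, here is a minimal Python sketch. It calibrates C and R from one operating point and then evaluates both terms elsewhere. The calibration point (50 W at 1.5 V and 1467 MHz, with 10% of that as leakage) uses figures that appear elsewhere in this thread, but treating them as a calibration is my assumption, and so is halving R to stand in for a hotter, leakier die. The function names are hypothetical.

```python
def chip_power(c_farads, r_ohms, vcore, freq_hz):
    """Return (P_active, P_leak) in watts for the parallel RC model."""
    p_active = c_farads * vcore ** 2 * freq_hz  # C V^2 f
    p_leak = vcore ** 2 / r_ohms                # V^2 / R
    return p_active, p_leak


def calibrate(total_w, leak_share, vcore, freq_hz):
    """Back out C and R from one known operating point."""
    p_leak = total_w * leak_share
    p_active = total_w - p_leak
    c = p_active / (vcore ** 2 * freq_hz)
    r = vcore ** 2 / p_leak
    return c, r


if __name__ == "__main__":
    # Assumed calibration point: 50 W at 1.5 V, 1467 MHz, 10% leakage.
    c, r = calibrate(50.0, 0.10, 1.5, 1467e6)
    # Halving R is an illustrative stand-in for leakage rising with die temp.
    for r_scale in (1.0, 0.5):
        act, leak = chip_power(c, r * r_scale, 1.95, 2555e6)
        share = leak / (act + leak)
        print(f"R x{r_scale}: active {act:.0f} W, leak {leak:.0f} W, "
              f"leak share {share:.0%}")
```

The absolute watts are only as good as the calibration; the point is to see how the leakage share grows when R drops while the active term stays put.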

After passing that temperature threshold, the portion P_leak going into heating the chip (the R component) becomes larger and larger as Vcore is increased. The additional power supplied to the CPU is wasted as P_leak instead of going into the useful P_active. In other words, the useful P_active that drives the chip faster (the C component) increases at a diminishing rate. The chip just heats up, which in turn slows it down, and it cannot be clocked any faster.

So on air, at lower temperatures of 10-20-30 C, the CPU can be clocked faster and faster at a rate of about 130-140 MHz/100 mV (for Tbred B, Barton). So far so good. But at the same time, the chip begins to heat up due to the Vcore increases (the V^2/R component and also the active CV^2f component). As a result of the heat, the electrons move slower inside the chip and the CPU begins to run slower. The above rate delf/delVcore begins to drop to 120, then 110, then 100, then 50 MHz/100 mV, ... when the die temperature goes beyond 30 C, 40 C, 50 C, ..., correspondingly for Vcore above 1.7, 1.8, 1.9 V, ... The heat increases at a faster rate than the Vcore and slows down the chip.

This is what we call the diminishing return on CPU frequency. Eventually, around 1.95 - 2 V for the Tbred B 1700+ DLT3C, it comes to a "stop" (due to heat, high current and system instability) even when more Vcore is put in, since the heat slows the chip down as electron mobility decreases with increasing temperature. There is no reason to increase Vcore any further (even if you don't kill the chip). For higher Vcore rated ones such as the 2100+ and Barton 2500+, that Vcore wall is around 2.05 - 2.2 V on air.

The above numbers are mainly for illustration, and they are roughly correct. But don't quote and use them for exact calculation.

If you use thermoelectric, phase change, ... extreme cooling, then due to the lower die temperature, as mentioned above, the chips can run much faster and reach a much higher frequency (e.g. 3+ GHz) at the same Vcore (compared to air/water) before the die reaches the temperatures seen on air. E.g. at 1.95-2 V the 1700+ will run at 2.5-2.6 GHz on air at 50 C, but it will run at 3 - 3.2 GHz at -10 C.

It does not mean you can put a much higher Vcore onto the chips at lower die temperature. Vcore is subject to transistor leakage increase and gate breakdown constraints. That they run faster is due to a combination of higher active power to sustain the computation (both logical and electrical) and lower die temperature, not higher Vcore alone.

What is the active power of a CPU at frequency f and voltage V

When a capacitor C is charged to charge Q within a time T by a current I, let V be the voltage across the capacitor:

Q = C V
I = Q / T
I = C V / T

If the capacitor is charged repetitively by a clock of frequency f and period T, then since f = 1/T,

I = C V f

Hence the current is proportional to the clock frequency f.

So the shorter the period T (faster clock), the bigger the current I, keeping V constant and all the capacitance inside the chip constant (1st order).

When the clock is doubled, the shape of the pulse (described by the aspect ratio or duty cycle) remains the same, i.e. the proportion of high (logic 1) and low (logic 0) intervals remains the same (1st order), as generated by the internal clock and pulse generators.

It is correct that when the clock is doubled (frequency is twice), the pulse width is halved. As a consequence, it takes half the time to charge up the same capacitance to the same voltage V, HENCE the current is DOUBLED (and not halved), because I = C V / T. (In this paragraph, V is kept constant. The current will be even higher if V is also increased.)

Further for power,

P = V I = C V^2 f

That explains why going from 1.5V 1.5 GHz to 1.65V 2.4GHz, the active power is almost double.

P2 / P1 = 1.65 x 1.65 x 2.4 / (1.5 x 1.5 x 1.5) = 1.94

And when going from 1.5V 1.47GHz to 2.2V 2.8GHz as Russell_hq is doing (see earlier post), the active power would be about 4x !! and the active current would be around 2.8x !! (CHECK the PSU !!!!)
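Because C cancels out when taking ratios, these scaling estimates only need the two voltages and two clocks. A minimal sketch (function names are hypothetical, and C is assumed constant to first order as in the text):

```python
def active_current_ratio(v1, f1, v2, f2):
    """I = C*V*f, so I2/I1 = (V2*f2)/(V1*f1); C cancels."""
    return (v2 * f2) / (v1 * f1)


def active_power_ratio(v1, f1, v2, f2):
    """P = C*V^2*f, so P2/P1 = (V2^2*f2)/(V1^2*f1); C cancels."""
    return (v2 ** 2 * f2) / (v1 ** 2 * f1)


if __name__ == "__main__":
    # Clocks in GHz; any unit works as long as both clocks use the same one.
    print(f"1.5V/1.5GHz -> 1.65V/2.4GHz: power x"
          f"{active_power_ratio(1.5, 1.5, 1.65, 2.4):.2f}")
    print(f"1.5V/1.47GHz -> 2.2V/2.8GHz: power x"
          f"{active_power_ratio(1.5, 1.47, 2.2, 2.8):.1f}, current x"
          f"{active_current_ratio(1.5, 1.47, 2.2, 2.8):.1f}")
```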

As for whether the transistors can keep up when the clock is increased: they won't always, and will fail at a certain freq. This is a separate issue. But as long as the chip can run at that frequency with a certain Vcore increase, the above current and power estimates hold.

How to estimate CPU static and active power

The power consumed by a CPU is

Power = Power_static + Power_active

The Power_active is proportional to C V^2 f, where V is the Vcore, f is the clock frequency, and C is the equivalent capacitor representing the chip. The active power does the computation by charging (and discharging) the hundreds of millions of small capacitors (from coupling among transistors, metal wires, silicon substrate) inside the chip through the tens of millions of FET transistors.

The Power_static is about 10-18% of the total power. It is the leakage current that biases the millions of transistors. It changes with V^2, where V is the Vcore. The leakage current path can be modeled by a resistor R, consuming a static power of V^2/R. This current is also sensitive to temperature; it increases as die temperature and Vcore increase during oc.

Using a simplified picture, the CPU can be viewed macroscopically as a resistor R and a capacitor C in parallel for power and current estimates. (There is also an inductor in series, but skip it for this discussion.)

From the air cooling example, going from 1.5V 1.5 GHz to 1.65V 2.4GHz,

OC Pactive = (1.65 * 1.65 * 2.4)/(1.5 * 1.5 * 1.5) = 1.936 (about 94 % increase; this is answer (c) on active power, which is almost 100%)

Assuming 10% of total power is static power.
OC Pstatic = 1.1 * 1.1 = 1.21 (21 % increase)

Going into more detail about total power, weighting both static and active power,
Total OC power = 1.21 * 0.1 + 1.936 * 0.9 = 1.86 (86% increase)
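The weighted estimate is easy to redo for other static shares (the text gives 10-18%). A small sketch, with a hypothetical function name and the static share as a parameter:

```python
def total_power_ratio(v1, f1, v2, f2, static_share=0.10):
    """Overclocked total power relative to the starting point.

    Static power is assumed to scale with V^2, active power with V^2 * f,
    and static_share is the static fraction of total power at (v1, f1).
    """
    static_ratio = (v2 / v1) ** 2
    active_ratio = static_ratio * (f2 / f1)
    return static_share * static_ratio + (1 - static_share) * active_ratio


if __name__ == "__main__":
    for share in (0.10, 0.18):
        ratio = total_power_ratio(1.5, 1.5, 1.65, 2.4, static_share=share)
        print(f"static share {share:.0%}: total power x{ratio:.2f}")
```

A larger static share pulls the total slightly down, because in this model the static part does not scale with frequency.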

We really have to adjust our thinking for these high-current CPU ocs running at high clock rates, especially in picking the right PSU, so that the 5V/12V lines have sufficient current to maintain a steady Vcore. It is very different from 6-12 months ago, when these things were in the 2 GHz range.

hitechjb1 · May 3, 2003

Why the 1700+ can run so fast at low Vcore?

The Tbred B 1700+ DLT3C is based on the same 0.13 micron bulk silicon process as all the other model 8 (Tbred A and Tbred B) from XP 1600+ to 2800+ (recently 3000+). (BTW, Tbred B has one more metal layer than Tbred A, both are 0.13u.)

The Hammers (Opteron, Athlon 64) are based on a 0.13 micron SOI process, and will go to 0.09 micron eventually.

The Tbred B 1700/1800+ have the same transistor count, same L1, L2 cache size, same number of metal layers, same chip dimensions, ... as the 1.6 and 1.65 V rated Tbred B.

Side track: Same for Barton, which is also based on the 0.13 micron process. But it is a different chip, with a different transistor count and chip dimensions, and a bigger L2 cache of 512KB instead of the 256KB in Tbred.

I think the reason why the Tbred B 1700+ DLT3C can work at its rated 1.5V and can be clocked to a similar highest clock frequency (if not better) as all the other 1.6V and 1.65V rated Tbred B is the following:

Its transistors have lower threshold characteristics due to process variation, which produces transistors with shorter channel length. A shorter channel means a lower transistor threshold; the transistor runs faster and draws larger leakage current and higher active current (hence higher active power). According to the AMD spec, the 1.5V 1700+ has a higher rated current than the 1.6V 1700+ (about 7% more).

The threshold voltage of a transistor is the gate voltage above which the transistor conducts a current from source to drain that is orders of magnitude higher than below the threshold. Chips with lower threshold transistors can perform equally well with a lower supply voltage (Vcore) as those with higher threshold, because the transistors can conduct at a lower gate voltage.

It is normal for a given silicon process (say 0.13u) to have such variation, so that transistors in some chip dies have a shorter channel length (less than 0.13u) and some have a longer channel length. Those with shorter channel length have faster intrinsic speed and can run as fast with a smaller Vcore applied (pros). On the other hand (cons), the lower threshold voltage draws higher leakage current and generates more heat at the same Vcore. So these chips can run as fast at a low Vcore as the higher Vcore rated chips, but they will max out at a lower Vcore compared to their higher Vcore rated siblings.

The 1700+ hits run-away current at a lower Vcore compared to the 2100+. Run-away current refers to the leakage current and the heat generated positively feeding each other, resulting in instability.

The final oc success of the Tbred B 1700+/1800+ DLT3C is a race between its natural, inborn, intrinsic characteristics: a balance and tradeoff between the smaller channel length and lower transistor threshold, hence faster speed, and the opposing, negative behaviour of higher leakage current and heat generated.

Originally posted by hitechjb1
I don't think the nominal p:n ratio of individual transistors in the Tbred B 1700+ DLT3C is any different from that of its DUT3C and DKT3C Tbred B siblings, or plays a role in its relatively better speed at the same voltage. They are from the same circuit design and wafer mask. So I think the speed gain of the DLT3C is mainly due to a reduction in transistor threshold voltage and channel length, a result of process variation. Such a deviation is only marginal and is less drastic than an actual change from technology scaling, such as going from 0.18 micron to 0.13 micron to 0.09 micron.

The sub-threshold leakage current increase due to process variation (say a 5-10% change in threshold voltage), though exponential in nature with voltage and temperature, would still be fairly small (an estimated 12-26%). Further, I estimate that such leakage current is around 10-15% of the total CPU current. As a result, the increase in leakage current is estimated to be 1-4% of the total chip current (within the 7% higher current I quoted from the spec for the 1700+ DLT3C).
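The 1-4% figure is just the product of the two estimated ranges. A quick sketch to check the arithmetic (the function name is made up, and the inputs are the estimates from the quote above, not measurements):

```python
def total_current_increase(leak_increase, leak_share):
    """Rise in total chip current when only the leakage part grows."""
    return leak_increase * leak_share


if __name__ == "__main__":
    low = total_current_increase(0.12, 0.10)   # 12% more leakage, 10% share
    high = total_current_increase(0.26, 0.15)  # 26% more leakage, 15% share
    print(f"total current increase: {low:.1%} to {high:.1%}")
```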

Link:

my theory on the new dlt3c's

Would the 1.5V DLT3C overclocks higher than the 1.6V DUT3C, why?

Difference DLT3C / DUT3C ???

At stock 1.5V 1467 MHz, power = 50W
As reported, some can run at 1.06V, 1467 MHz, for active power
active power = 50 x 1.06 x 1.06 x 1467 / (1.5 x 1.5 x 1467) ~ 25W (half that at rated Vcore) !!!!

hitechjb1 · May 3, 2003

How to interpret Sandra CPU benchmark, IPC and comparing Tbred B/Barton with P4

Let's look at the integer and floating point calculation benchmarks:

The Sandra arithmetic (integer and floating point) benchmark, like other CPU benchmarks (e.g. SPEC), is a set of predetermined programs, chosen by some means, to test raw CPU speed. So the results from different benchmarks can differ for a given CPU and system.

For the same benchmark, such as Dhrystone, Whetstone, SPEC, ..., the result is mainly determined by the raw CPU clock frequency (for a given CPU architecture) and cache size. So for the same CPU type, e.g. Tbred B, Barton, P4, ..., the integer and floating point benchmark result per clock frequency is almost a constant.

For Tbred B and Barton:
Sandra integer Dhrystone IPC (instructions per cycle) = DMIPS / frequency = 3.76 instructions / cycle
E.g.
- XP 1700+ DLT3C (256KB L2) = 9560 / 2558 = 3.74 instructions / cycle (tested result)
- Barton 3000+ (512KB L2) = 8130 / 2160 = 3.76 instructions / cycle
- XP 2600+ (256KB L2) = 7829 / 2080 = 3.76 instructions / cycle

Sandra floating point Whetstone IPC = MFLOPS / frequency = 3180/2080 = 1.53 instructions / cycle

For P4:
P4B Dhrystone integer IPC = 8164/3060 = 2.668
P4B Whetstone floating point IPC = 1717/3060 = 0.561 (w/o SSE2)
P4B Whetstone floating point IPC = 4009/3060 = 1.310 (w/ SSE2)

Ratio between XP to P4B:
Dhrystone integer IPC = 1.41:1
Whetstone floating point IPC = 2.73:1 (w/o SSE2), 1.17:1 (w/ SSE2)

For P4 w/ 2 SMT (simultaneous multi-threading, i.e. Hyper-Threading):
P4B Dhrystone integer IPC = 8957/3060 = 2.927
P4B Whetstone floating point IPC = 2632/3060 = 0.86 (w/o SSE2)
P4B Whetstone floating point IPC = 5738/3060 = 1.875 (w/ SSE2)

Ratio between XP to P4B w/ 2 SMT:
Dhrystone integer IPC = 1.28:1
Whetstone floating point IPC = 1.78:1 (w/o SSE2), 0.816:1 (w/ SSE2)
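All six ratios above come from the same division, so they can be tabulated in one go. This is a sketch using the scores quoted in this post and the rounded XP per-clock figures (3.76 and 1.53); the dictionary layout and function name are my own.

```python
# XP per-clock figures as quoted in this post (rounded).
XP_IPC = {"int": 3.76, "fp": 1.53}

P4_CLOCK_MHZ = 3060

# Sandra scores for the P4 3.06 GHz as quoted in this post.
P4_SCORES = {
    "P4B": {"int": 8164, "fp": 1717, "fp_sse2": 4009},
    "P4B w/ 2 SMT": {"int": 8957, "fp": 2632, "fp_sse2": 5738},
}


def xp_to_p4_ratios(scores, clock_mhz=P4_CLOCK_MHZ):
    """Return XP:P4 per-clock ratios for integer, FP and FP with SSE2."""
    return {
        "int": XP_IPC["int"] / (scores["int"] / clock_mhz),
        "fp": XP_IPC["fp"] / (scores["fp"] / clock_mhz),
        "fp_sse2": XP_IPC["fp"] / (scores["fp_sse2"] / clock_mhz),
    }


if __name__ == "__main__":
    for name, scores in P4_SCORES.items():
        ratios = xp_to_p4_ratios(scores)
        print(name, {key: round(value, 2) for key, value in ratios.items()})
```

A ratio above 1 means the XP does more work per clock on that test; below 1 means the P4 does.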

Hence, for integer computation in terms of raw CPU power, a Tbred B running at 2.5 GHz is as fast as
- a P4 at 3.5 GHz (= 2.5 x 1.41)
- a P4 w/ 2 SMT at 3.2 GHz (= 2.5 x 1.28)

In other words, the AMD XP does more integer instructions per clock than the P4, at a ratio of around 1.41:1, or 1.28:1 (for P4 w/ 2 SMT).

Since the IPC ratio for integer calculation between XP and P4 w/ 2 SMT is 1.28:1, for a P4 w/ 2 SMT to deliver an integer D-MIPS corresponding to the 1700+ overclocked to 2.558 GHz, the P4 w/ 2 SMT has to run at 1.28 x 2.558 = 3.27 GHz.

And for the earlier 2518 MHz stable result, P4 w/ 2 SMT has to run at 3.22 GHz for the same integer MIPS.

The highest stock frequency of the currently available P4 w/ 2 SMT is 3.06 GHz. So for the sake of illustration, this 1700+ ($65) overclocked to 2.558 GHz is already achieving what a P4 3.06 GHz does at stock, and then some (actually equating to the raw CPU power of a 3.27 GHz P4).

Of course, if the P4 is overclocked, it will be higher also. I think AZN has shown in another thread a P4 running at 3.42 GHz with oc. Some P4 may go as high as 3.8 GHz, which would then take an XP running at 3.8/1.28 = 2.97 GHz to break even in integer D-MIPS, and that is doable for some XP chips.

P4 mb and chipset in dual channel can deliver 50-60% more effective memory bandwidth (** see add on) than nforce2 dual channel (taking into account memory controller overhead, ... for both chipsets).

For floating point MFLOPS, P4 has an IPC of 1.31 (w/ SSE2), or 1.875 (w/ SSE2 and 2 SMT), and AMD XP 1.53. That means for the same CPU clock rate, the XP to P4 is 1.17:1 or 0.816:1 (P4 w/ 2 SMT).
Without SSE2, P4 has an IPC of 0.561 or 0.86 (P4 w/ 2 SMT), and AMD XP 1.53, hence XP to P4 is 2.73:1 or 1.78:1 (P4 w/ 2 SMT).

This post is just to show that the Tbred B 1700+ DLT3C ($65) running at 2.56 GHz delivers D-MIPS as good as a 3.2 GHz P4 in integer calculations. Using a P4 2.4 GHz overclocked to this level of performance would cost about $157.

The highest stock clock rate P4 available today is 3.06 GHz (~ $390). The highest stock clock rate XP available today is a 2.16 GHz Barton 3000+ (~ $329).

So if both are run at stock clock speed, as the manufacturers ship them, the clock ratio is 3.06 / 2.16 = 1.42. That puts the P4 w/ 2 SMT ahead of the XP (1.42 against an IPC ratio of 1.28), and the P4 without it roughly at break-even (1.42 against 1.41).

Here I use only integer and floating point raw calculation for evaluating price-performance between CPUs. For other metrics such as multimedia, cache, memory, ... we can do similar things too.

I try to present some objective evaluations of CPUs, ... Questions and comments welcome.

** Addon for memory bandwidth:

The max bandwidth for DDR between memory controller and CPU would be 2 x 8 x FSB = 16 FSB MB/s. The x2 is because of DDR (data are transferred on both the rising and falling edges of the FSB clock); the x8 is because of the 8-byte (64-bit) bus. The effective bandwidth, taking into account memory controller overhead (~95% efficiency), would be around 15.2 FSB. E.g. FSB = 200 MHz, effective bandwidth ~ 3040 MB/s.

Dual channel makes a big difference for a P4 dual channel mb though, due to the quad-pumped data bus of the P4 (QDR). The max bandwidth for P4 dual channel is 4 x 8 x FSB = 32 FSB MB/s. The effective bandwidth, taking into account memory controller overhead (~75% efficiency), would be around 24 FSB MB/s.

For AMD dual channel, max bandwidth = 16 FSB, effective bandwidth ~ 15.2 FSB (95% efficiency). Hence the improvement of effective bandwidth = (24 - 15.2)/15.2 = 58% for P4 dual channel system over AMD dual channel.

E.g. FSB = 200 MHz, effective bandwidth ~ 4800 MB/s, which is around 60% more than that of an nforce2 mb running the same FSB of 200 MHz.
E.g. running fsb:memory=5:4, with FSB=250, memory=200, effective bandwidth ~ 24 x 225 = 5400 MB/s.
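The bandwidth estimates above follow one formula, transfers per clock x bus width x FSB x efficiency. A minimal sketch, where the efficiency figures (95% and 75%) are the rough estimates from this addon and the function name is hypothetical:

```python
def effective_bandwidth_mb_s(fsb_mhz, pumps, efficiency, bus_bytes=8):
    """Effective bandwidth in MB/s.

    pumps is the number of transfers per FSB clock (2 for DDR, 4 for QDR)
    and bus_bytes is the bus width in bytes (8 for a 64-bit bus).
    """
    return pumps * bus_bytes * fsb_mhz * efficiency


if __name__ == "__main__":
    amd = effective_bandwidth_mb_s(200, pumps=2, efficiency=0.95)
    p4 = effective_bandwidth_mb_s(200, pumps=4, efficiency=0.75)
    print(f"AMD: {amd:.0f} MB/s, P4: {p4:.0f} MB/s, "
          f"P4 advantage: {(p4 - amd) / amd:.0%}")
```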

hitechjb1 · May 3, 2003

2539/2555 MHz at 1.94 V, SK7 + Vantec Tornado (3 posts)

Today I changed to a Vantec Tornado 80mm fan, and I am able to run the TB B 1700 DLT3C stable up to 2555 MHz. Before, it was around 2520 MHz. The HS is still the SK7.

The Vcore is set to 1.95V; temperature is 42 C Prime95 loaded, 36 C idle, and system temp is 27 C.

Sandra reports the model number as 3200+, and the estimated PR rating as 3700+.

Going just by the integer benchmark result of 9556 D-MIPS, it is like running a P4 at 3.26 GHz (= 9556/2.927) on integer computations. The 2.927 is the P4 IPC (instructions per cycle) number from the Sandra Dhrystone MIPS for integer computation.

Throwing in $14 for the fan and getting a 35 MHz improvement costs $0.4 / MHz, very expensive. The original CPU chip is only $60/2400MHz = 2.5 cents/MHz !!!!
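The cost-per-MHz comparison as a tiny sketch (the function name is made up; prices and clocks are the ones in this post):

```python
def dollars_per_mhz(cost_usd, mhz):
    """Cost in dollars for each MHz obtained."""
    return cost_usd / mhz


if __name__ == "__main__":
    fan = dollars_per_mhz(14, 35)     # $14 fan for the last 35 MHz
    cpu = dollars_per_mhz(60, 2400)   # $60 CPU for the first 2400 MHz
    print(f"fan: ${fan:.2f}/MHz, CPU: ${cpu:.3f}/MHz, "
          f"the last MHz cost {fan / cpu:.0f}x more")
```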

At almost 2V, it is not responding to Vcore (due to leakage current ?) even though the die temp is only 42 C loaded. It looks like there is not much chance of getting stable at the 2.6 GHz level.

It is drawing a huge current, as I observed from the voltage fluctuation on the 5V line and Vcore. On the 5V line there was about 80 mV of fluctuation, and the reading was 160 mV off the no-load value, right over the edge of the 3% PSU spec. There is a slight chance things can be improved with a better PSU and super cooling to cut down the (leakage) current.
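To see why 160 mV is "right over the edge": 3% of 5 V is 150 mV. A small sketch of that check, assuming the 3% tolerance mentioned above (the function name is hypothetical):

```python
def within_rail_tolerance(nominal_v, deviation_mv, tolerance=0.03):
    """True if the deviation stays inside the allowed band of the rail."""
    allowed_mv = nominal_v * tolerance * 1000.0
    return allowed_mv >= deviation_mv


if __name__ == "__main__":
    for dev in (80, 160):
        verdict = "OK" if within_rail_tolerance(5.0, dev) else "out of spec"
        print(f"5V rail, {dev} mV deviation: {verdict}")
```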

Links for power and current estimates, PSU requirements and discussions:
Question: How much power is increased ...
how many watts does your cpu take up?

Sandra CPU benchmark at 2555 MHz

Sandra multimedia benchmark at 2555 MHz

hitechjb1 · May 3, 2003

PCMark 2001 at 2539 MHz

hitechjb1 · May 3, 2003

Prime95 at 2539 MHz

Audioaficionado · May 3, 2003

My results so far at stock voltage

I tried to go with 200 fsb 2200 core at stock 1.5v but it wouldn't boot into w2k so I backed off to 200/2100/1.5v and it seems OK so far.

I'm not going past stock voltage as I don't want any long term layer migration problems. I want this rig to last for at least a few years, not mere months.

I'm running FAH right now, which runs the CPU at a constant 100% usage. It uses idle cycles like Prime95, so it seems to be an equal stress test.

What can you use to stress and test the whole system with meaningful results, not just a bragging score?

Edit: I just had a lockup so I backed down a little. I'm very satisfied as I've reached my goal of 200/2000 on stock voltages.

hitechjb1 · May 3, 2003

I agree that operating the CPU at such a high Vcore (30% above nominal), with high fan noise (for air cooling), for a 100-200 MHz gain is not totally justified, for price-performance reasons, for the long-term "health" of the CPU (electro-migration, leakage current and heat stress), and for environmental reasons (such as noise, heat), ....

It is for learning overclocking technique and getting some good benchmarks and satisfaction (bragging, as you said).

But actually, I learned quite a bit from this hands-on experience with the low Vcore rated chip and its high current behaviour (its rated current at 1.5V is higher than the 1.6V 1700+ by 7%). I have been amazed by this chip from day one, and have tried to explain why and how it works so fast with a smaller voltage (but it turns out that it will max out sooner than the others). I posted some explanation of it in another thread: my theory on the new dlt3c's

This is the first time I have overclocked a CPU at this aggressive a level. Normally I just oc 10-20%, even just using the stock HSF. I learned many things, such as how clock frequency, die temperature, cooling, fan speed and Vcore vary together, how to look into and pick heat sinks, fans and PSUs, how heat and die temp change with Vcore, ..., and many, many more things from this forum.

In the end, I think it is not totally justified to stress the CPU like this for the last 100 - 200 MHz, since I have to run at a Vcore of 1.95 V (30% above nominal). It is better just to run it at 1.65V (10% above nominal) at 2400 MHz, which is the best long-term operating point for this particular CPU. Going from 2400 MHz to the current 2550 MHz is very expensive in terms of price-performance, noise, voltage and heat control.

When this is done, the CPU will most likely go back to a more natural frequency and settings.

Audioaficionado · May 3, 2003

I'm glad you pushed it to the limit so we can see what's possible. This was very educational, and maybe a sticky might be a way to keep it easily accessible for the forum.

BTW what was your stable limit at stock voltages?
