
9x333, 8x375, or 7x428 on a Q6600 - Which is faster?


graysky
What is a better overclock?

Most people believe that a higher FSB and a lower multiplier are better, since this maximizes bandwidth on the FSB. But is a lower bus speed with a higher multiplier actually any worse? Or is there no difference at all? I looked at three different settings on my Q6600:

9x333 = 3.0 GHz (DRAM was 667 MHz)
8x375 = 3.0 GHz (DRAM was 750 MHz)
7x428 = 3.0 GHz (DRAM was 856 MHz)

The DRAM:FSB ratio was 1:1 for each test, and the DRAM voltage and timings were held constant: 2.25 V with timings of 4-4-4-12-4-20-10-10-10-11.
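
For anyone who wants the arithmetic spelled out, here's a quick sketch of how each setting works out (plain Python, just my own illustration -- the relationships are the standard ones: core clock = multiplier x FSB, DDR2 runs at twice the FSB with a 1:1 divider, and the bus is quad pumped):
Code:
# FSB/multiplier arithmetic for the three test settings (1:1 divider)
settings = [(9, 333), (8, 375), (7, 428)]  # (multiplier, FSB in MHz)

for mult, fsb in settings:
    core = mult * fsb  # CPU core clock in MHz
    dram = 2 * fsb     # DDR2 effective speed at 1:1 (double data rate)
    quad = 4 * fsb     # quad-pumped FSB rating
    print(f"{mult}x{fsb}: core {core} MHz, DDR2-{dram}, FSB{quad}")
Running it reproduces the three rows above: roughly 3.0 GHz core each time, with DRAM speeds matching the 667/750/856 figures (give or take rounding).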

After running the same experiments at each of these settings, I concluded that there is no difference for real-world applications. If you use a synthetic benchmark like Sandra, you will see faster memory reads/writes, etc. with the higher FSB values -- so what? These high FSB settings are great if all you do with your machine is run synthetic benchmarks. But the higher FSB values come at the cost of higher voltages for the board, which equate to higher temps.

I think that FSB bandwidth is simply not the bottleneck in a modern system... at least when starting at 333 MHz. Perhaps you would see a difference starting from a slower bus. In other words, a 333 MHz FSB quad pumped to 1333 MHz is more than sufficient for today's applications; when I increased it to 375 MHz (1500 MHz quad pumped) I saw no real-world change, and the same was true when I pushed it up to 428 MHz (1712 MHz quad pumped). Don't believe me? Read this thread, wherein x264.exe (a video encoder) is run at different FSB and multiplier values. Have a close look at the 3rd table in that thread and note that the FPS (frames per second) numbers are nearly identical for a chip run at the same core clock with different FSB speeds. This was found to be true of C2Q as well as C2D chips.
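
To put a number on "more than sufficient": the Core 2 front side bus is 64 bits wide and quad pumped, so peak theoretical bandwidth scales directly with the bus clock. A back-of-the-envelope sketch (the 64-bit / 4-transfers-per-clock figures are the standard bus spec, not something I measured):
Code:
# Peak theoretical FSB bandwidth: 8-byte-wide bus, 4 transfers per clock
BUS_WIDTH_BYTES = 8      # 64-bit front side bus
TRANSFERS_PER_CLOCK = 4  # quad pumped

for fsb_mhz in (333, 375, 428):
    gbps = fsb_mhz * 1e6 * TRANSFERS_PER_CLOCK * BUS_WIDTH_BYTES / 1e9
    print(f"{fsb_mhz} MHz FSB -> ~{gbps:.1f} GB/s peak")
That works out to roughly 10.7, 12.0, and 13.7 GB/s of theoretical headroom -- and the real-world apps apparently never used the extra.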

You can do a similar test for yourself with applications you commonly use on your machine. Time them with a stopwatch if the application doesn't report its own benchmark numbers like x264 does.
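
If you'd rather not sit there with a stopwatch, something like this does the job (a minimal sketch; the lame command is just the example from the tests below -- swap in whatever you actually run, and make sure it's on your PATH):
Code:
# Minimal benchmark harness: time a command over several runs, average it
import subprocess, time

CMD = ["lame", "-V", "2", "--vbr-new", "test.wav"]  # example workload
RUNS = 5

elapsed = []
for _ in range(RUNS):
    start = time.perf_counter()
    subprocess.run(CMD, check=True)  # raises if the command fails
    elapsed.append(time.perf_counter() - start)

print(f"average over {RUNS} runs: {sum(elapsed) / len(elapsed):.2f} s")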

Some "Real-World" Application Based Tests

Three different 3.0 GHz settings on a Q6600 system were tested with several apps: LAME, Super Pi, x264, WinRAR, and the trial version of Photoshop CS3. Here are the details:

Test O/C 1: 9x333 = 3.0 GHz
[settings screenshot]

Test O/C 2: 8x375 = 3.0 GHz
[settings screenshot]

Test O/C 3: 7x428 = 3.0 GHz
[settings screenshot]


Result: I could not measure a difference between an FSB of 333 MHz, 375 MHz, or 428 MHz using these application-based, "real-world" benchmarks.

Since 428 MHz is about 28 % faster than 333 MHz ((428 - 333)/333 = ~28.5 %), you'd think that if the FSB were indeed the bottleneck, the higher values would have given faster results. I believe that the bottleneck for most apps is the hard drive.

Description of Experiments and Raw Data

Lame version 3.97 – Encoded the same test file (an ~60 MB wav) with these command-line options:
Code:
lame -V 2 --vbr-new test.wav
(which is equivalent to the old --alt-preset fast standard). I ran the encode a total of 10 times and averaged the reported play/CPU ratio as the benchmark.

Super Pi version 1.1 – Ran both the 1M and 2M tests and used the reported total calculation time in seconds as the benchmark.

x264 version 0.54.620 – Ran a 2-pass encode on the same MPEG-2 (480x480 DVD source) file twice and averaged the FPS1 and FPS2 numbers as the benchmark. In case you're wondering, here are the command-line options for pass 1:
Code:
x264 --pass 1 --bitrate 1000 --stats "C:\work\test-NEW.stats" --bframes 3 --b-pyramid --direct auto --subme 1 --analyse none --vbv-maxrate 25000 --me dia --merange 12 --threads auto --thread-input --progress --no-psnr --no-ssim --output NUL "C:\work\test-NEW.avs"

And for pass 2:
Code:
x264 --pass 2 --bitrate 1000 --stats "C:\work\test-NEW.stats" --ref 3 --bframes 3 --b-pyramid --weightb --direct auto --subme 6 --trellis 1 --analyse all  --8x8dct --vbv-maxrate 25000 --me umh --merange 12 --threads auto --thread-input --progress --no-psnr --no-ssim --output "C:\work\test-NEW.264" "C:\work\test-NEW.avs"

The input avisynth script was:
Code:
global MeGUI_darx = 4 # display aspect ratio 4:3 (MeGUI globals)
global MeGUI_dary = 3
DGDecode_mpeg2source("C:\work\test-new.d2v") # load the MPEG-2 source
AssumeTFF() # top-field-first interlacing
Telecide(guide=1,post=2,vthresh=35) # IVTC
Decimate(quality=3) # remove dup. frames
crop( 2, 0, -10, -4) # trim 2 px left, 10 px right, 4 px bottom
Spline36Resize(640,480) # Spline36 (Neutral)

RAR version 2.63 – Had rar run my standard backup batch file, which generated about 0.98 GB of rars (1,896 files in total). Here is the command line I used:
Code:
rar a -u -m0 -md2048 -v51200 -rv5 -msjpg;mp3;tif;avi;zip;rar;gpg;jpg  "e:\Backups\Backup.rar" @list.txt
where list.txt is a list of all the dirs I want backed up. I timed how long the backup took to complete with a stopwatch, ran it twice, and averaged the two times as the benchmark.

Trial of Photoshop CS3 – I used the batch function in PS CS3 to bicubic-resize images from 10.1 MP down to 0.7 MP (3872x2592 --> 1024x685), then applied an unsharp mask (60 %, 0.8 px radius, threshold 12), and finally saved them as quality-8 jpgs. In total, 57 jpg files were processed in the batch. I timed two complete runs and averaged them as the benchmark.
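
For reference, the same batch is easy to approximate outside of Photoshop. Here's a rough equivalent using Python with the Pillow library (not what I tested -- Photoshop's resampler and JPEG encoder will differ, quality=80 is only a guess at what PS quality 8 maps to, and the in/out folder names are made up):
Code:
# Approximate the PS CS3 batch: bicubic resize -> unsharp mask -> jpg
from pathlib import Path
from PIL import Image, ImageFilter

out_dir = Path("out")
out_dir.mkdir(exist_ok=True)

for path in Path("in").glob("*.jpg"):
    img = Image.open(path)
    img = img.resize((1024, 685), Image.BICUBIC)  # 3872x2592 -> 1024x685
    img = img.filter(ImageFilter.UnsharpMask(radius=0.8, percent=60,
                                             threshold=12))  # 60 %, 0.8 px, 12
    img.save(out_dir / path.name, "JPEG", quality=80)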

Here are the raw data if you care to see them:
[table of raw benchmark data]
 
So essentially an average gain of 1% on encoding (333->428), but apparently nonexistent otherwise? Cool results, nonetheless.
 
Very interesting. I'd like to see someone confirm this on a C2D now... I kinda always assumed that no matter how you end up at your final clock speed, whether it be a higher FSB and low multiplier or vice versa, the end results would be just about the same. Good article!

-Dave
 
FYI, at 428 the 1333 strap will be used, making latency much higher than in the other tests.

If you're posting Super Pi results, you shouldn't be truncating the millisecond value. But 1M and 2M both fit in the cache and won't exhibit a difference anyway; 8M and up is required.

However, if those are the apps you use daily, then there's no issue in running a slower front side bus. Good work though; it must have taken plenty of time.
 
Omsion said:
So essentially an average gain of 1% on encoding (333->428), but apparently nonexistent otherwise? Cool results, nonetheless.

Yeah, and I think that 1 % is probably within error. Doing these benchmarks, the C2Q/C2D really don't benefit from a higher bus. I'd like to see someone who does fold@home or whatever@home do the same experiment since those apps don't really use the HD at all... then again, superpi doesn't use the HD.
 
Gautam said:
However if those are the apps that you use in daily then there's no issue in running a slower front side bus.

That should really be the take home message in my opinion. After all, you don't run Sandra or memtest daily do ya?
 
graysky said:
That should really be the take home message in my opinion. After all, you don't run Sandra or memtest daily do ya?
I run superpi daily. :D

It's not even a joke these days. :(
 
Gautam said:
I run superpi daily. :D

It's not even a joke these days. :(

lmao!

I spend more time wondering about how to improve my rig than playing games on it... :cry:
 
Most tests show there is only a marginal difference between a 1066 MHz and a 1333 MHz FSB. The CPU basically determines the final speed, and it is the bottleneck now. Even a larger cache only makes a few percentage points of difference because the CPU just can't do the calculations fast enough to keep up with the FSB.

Hard drives are a problem with some applications but the vast majority of information can be cached in the system to give lightning quick response times.

We need faster CPUs................I'm waiting :temper:

[Edit] Oops, I spoke too soon; apparently on some FAH applications, doubling the cache results in a 2x speed increase... so the cache can make a huge difference.
 
orion456 said:
Most tests show there is only a marginal difference between a 1066 MHz and a 1333 MHz FSB. The CPU basically determines the final speed, and it is the bottleneck now. Even a larger cache only makes a few percentage points of difference because the CPU just can't do the calculations fast enough to keep up with the FSB.

Hard drives are a problem with some applications but the vast majority of information can be cached in the system to give lightning quick response times.

Totally agree :)
 
The hard drive is the biggest bottleneck. IMO it has been this way for the longest time. I'd rather see more advancement in this field than in any other. I am just waiting for the industry to switch to mature storage devices with no moving parts.
 
Omsion said:
So essentially an average gain of 1% on encoding (333->428), but apparently nonexistent otherwise? Cool results, nonetheless.

Also, even with that barely 1% increase, the only thing that really goes up between the three tests is the voltage. That's a lot of extra heat/power for 1%.
 
What about gaming, though? Are the results the same?

Are these bad-*** C2Ds now the bottleneck? Guess I'll have to go extreme cooling and overclock the new E6750 to 5 GHz+! :attn:
 
Veratule said:
Also even with that barely 1% increase, the only thing that really increases between the 3 tests is the voltages. That's a lot of extra heat/power for 1%.

That's the key! Everything is a risk/benefit analysis.

jason4207 said:
What about gaming, though? Are the results the same?

Good question. I'll leave it to someone else to test it out.
 
Gautam said:
I run superpi daily. :D

It's not even a joke these days. :(

I feel it, man. Just like the console in your sig, G, I go for a personal record in 3DMark06 every day. When I'm not in COD2 or on the forums here, I'm stressing 3D.
 