The Relationship Between CPU Frequency And Performance. Critique, Also Answer Qs?

Excelsior · Mar 29, 2005

The Relationship Between Processor Frequency and Performance.

Abstract

Those who work in the computer science field are always asking themselves one question, “how can I get more performance?” To satisfy these performance addicts and to stay profitable, Central Processing Unit (CPU) manufacturers have to ask themselves, “Are we manufacturing these CPUs in the most efficient manner?” Efficiency equates to money, and that’s why it’s important to understand the fundamentals of microprocessors and what laws govern them.

A CPU plays a pivotal role in anyone’s computer system. It’s what makes the computer tick, and it’s one of the most common culprits if a system is acting sluggish. The intention of this project was to find out what kind of relationship frequency and performance had, and if there is any way one could utilize this relationship to one’s advantage. The hypothesis theorized that the relationship between frequency (speed, measured in megahertz) and performance (as measured by benchmarking applications) would be absolutely linear.

The experiment was conducted in a relatively simple manner. A modern microprocessor was clocked at a given frequency and tested for performance with several benchmarking applications; the system was then rebooted, and the process was repeated at each successive frequency up to 2400 MHz. Results were recorded after each step. The frequencies covered were 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1900, 2000, 2100, 2200, 2300, and 2400 MHz. This supplied a wide range of frequencies and an adequate range of data.

To see which model fit best, exponential and linear models were fitted to the results of each benchmark, and the r value, also called the correlation coefficient, was calculated for each model. The correlation coefficient shows how strongly a model fits the data. The results were mixed. While some benchmarks favored a linear model, others in fact more closely followed an exponential one. While there may have been experimental errors, none could account for results this consistent.

While it may be tempting, it is impossible to draw any concrete conclusions from this data alone. That doesn’t make this experiment useless; in fact, quite the opposite. It has opened up a new subject of discussion, and opened the door to future testing. Hopefully, with more expansive and controlled tests, the relationship may finally be figured out once and for all.

Introduction

One of the demons plaguing computing is performance. Microprocessor manufacturers strive to release the best possible product, in a cutthroat race to the top, claiming the performance crown. But just what forces act upon this performance-oriented race? This experiment intends to determine whether a microprocessor’s frequency, as measured in megahertz (MHz), has a linear or an exponential relationship to performance, as measured by benchmarking applications.

Wikipedia defines a Central Processing Unit (CPU) as “the part of a computer that interprets and carries out the instructions contained in the software.” Simply put, the CPU is the heart of the modern computer. When all parts of a CPU are on a single integrated circuit (IC), it is referred to as a microprocessor. A CPU gets its clock speed from a simple equation: Front Side Bus (FSB) frequency times multiplier equals the resulting clock frequency. (http://en.wikipedia.org/wiki/CPU)

While this may sound very complex, in reality it is not. Very simply, a CPU takes in raw instructions and data from the other parts of the computer, manipulates the data and runs processes as directed by other components, and outputs results. Whenever anything occurs on one’s system, the CPU has computed it. Almost no information is passed on to the user without first being manipulated by the CPU.

One of the main concerns of chip makers is clock speed. Clock speed is also dubbed frequency or operating speed, and it is measured in megahertz. The faster the clock speed, the more MIPS, or “Millions of Instructions Per Second”, a CPU can execute. A CPU performs several different functions on the data it obtains from the rest of the computer, and depending on the desired output, it manipulates that data accordingly. Increase the frequency and the CPU can work through more instructions per second, completing its computations faster.

If chip manufacturers are able to understand the relationship between frequency and performance, then they may be able to estimate future limitations and ceilings ahead of time, and engineer ways to overcome them.

Gordon Moore predicted that the transistor density of chips would keep doubling at a regular interval, commonly quoted as every eighteen months. More chip density means smaller circuits in a smaller amount of space, allowing for even faster frequencies and thereby more performance. This being an exponential equation, one may wonder about the limitations of this law, and many people have. There are constant debates and contradicting articles on the Internet, some disputing that the law can apply forever and some defending it. Surely there must be some point where the equation can’t take any more, some point at which the chip manufacturers are unable to stuff any more transistors into an already tightly packed space. While that battle rages on, there are yet other frontiers which need to be conquered. If there is a limit to how much we are able to get out of microprocessors, how will we make them more efficient, and faster? (http://www.intel.com/research/silicon/mooreslaw.htm)

There are many optimizations that chip manufacturers released to help increase performance as well—SSE, special instruction sets, larger on-die cache, the list goes on and on. The bottom line is they are not only blindly increasing megahertz. The question arises, disregarding the other exponential relationships in computing, is there a possibility that there is an exponential relationship occurring with the same chip that is clocked at different frequencies?

Thus, this experiment was formed. The hypothesis was that the relationship between processor frequency and resulting performance must be a linear relationship. Logic would tell one that the faster clocked the processor, the more instructions it will be able to perform. As you double the speed of the processor, you double the raw performance.

Experimental Design

The experiment was set up to be fairly easy to carry out, since benchmarking is taxing on the computer rather than on the researcher. The experiment required a test bed system. An AMD AthlonXP 1700+ processor, an Epox 8rda+ motherboard, 2x256MB pc3200 Kingston RAM, an ATI Radeon 9800 Pro, two Western Digital 120GB 7200RPM drives, one self-built case, and one self-built watercooling system were used in this experiment. Watercooling was chosen because in such a system ambient temperature has the least effect on the temperature of the processor, thus reducing the influence of one of the variables.

The benchmarks chosen were Sisoft Sandra 2005, Pifast, Prime95, and ScienceMark 2005. These benchmarks were chosen for their diversity in the way they stress the microprocessor: Sisoft Sandra has an arithmetic as well as a multimedia mode; Pifast times how long the microprocessor takes to calculate millions of digits of pi (for more on the method used to calculate pi, see Appendix A); Prime95 times the calculations it uses to search for prime numbers; and ScienceMark times how long the microprocessor takes to solve several scientific computations. All of these benchmarking applications tax the CPU in a controlled way: they feed the CPU data, then record the amount of time it takes to output a result.

In preparation for the experiment, all startup applications were disabled via msconfig (a system configuration utility), so that nothing else would call upon the CPU and affect the results. To further eliminate variables, the Vcore (the voltage supplied to the CPU) was kept the same throughout the tests, and the FSB (Front Side Bus) was kept at 200 MHz. The first multiplier was 5, giving the chip a 1000 MHz clock speed. The system was then booted into Windows XP, at which point a program called Free Ram Optimizer XP was executed. To make sure every run had access to a similar amount of RAM, Free Ram Optimizer XP was used to free 400 MB of system memory. Upon completion, Sisoft Sandra was loaded, and the arithmetic and multimedia benchmarks were run (see Appendix D for a screenshot). Free Ram Optimizer XP was then used again before the next benchmark. Next, a .bat file was used to execute Pifast (see Appendix C for a screenshot of Pifast in action) and have it calculate 33554432 digits of pi (see Appendix B for the contents of the .bat file). Afterwards, Free Ram Optimizer XP was executed to clear the memory of any lingering, unused information (see Appendix E for a screenshot). Prime95 was then opened, and benchmark mode was enabled. Prime95 recorded the times it took to complete its calculations at different settings (see Appendix F for a screenshot). Prime95 was then closed, and Free Ram Optimizer XP was run one last time. Science Mark 2005 was then opened, and the Molecular Dynamics program was loaded (see Appendix G for a screenshot). The standard model was rendered, and the program automatically recorded the time it took to render. Then Primordia (a sub-application of Science Mark 2005) was run, which simulated different conditions upon aluminum (see Appendix H for a screenshot).
The last benchmark in Science Mark 2005 was the Cipher benchmark (see Appendix I for a screenshot), which recorded how long the computer took to encipher data using default settings. Upon completion, the system was rebooted and the multiplier was changed to 5.5, resulting in an 1100 MHz frequency. This process was repeated in multiplier increments of 0.5 until a frequency of 2400 MHz was reached. At that point the limitations of the equipment became apparent: the computer would not boot at any higher frequency. The only other limitation of this method was that the test motherboard offered no 9.0 multiplier, so it was impossible to test at 1800 MHz. The procedure used while testing, written as a quick field reference, is listed in Appendix K.
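The frequency steps described above follow directly from the FSB-times-multiplier rule. The short Python sketch below reproduces the list of tested frequencies; the function name and its parameters are made up for illustration, and the only facts taken from the paper are the 200 MHz FSB, the 5.0 starting multiplier, the 0.5 step, the 2400 MHz ceiling, and the missing 9.0 multiplier.

```python
FSB_MHZ = 200  # front side bus frequency held constant in the experiment


def tested_frequencies(fsb_mhz=FSB_MHZ, first=5.0, last=12.0, step=0.5, missing=(9.0,)):
    """Return (multiplier, frequency in MHz) pairs for every multiplier that could be set.

    `missing` lists multipliers the motherboard did not offer (9.0 in the paper).
    """
    count = int(round((last - first) / step)) + 1
    multipliers = [first + i * step for i in range(count)]
    return [(m, int(round(fsb_mhz * m))) for m in multipliers if m not in missing]


if __name__ == "__main__":
    for multiplier, mhz in tested_frequencies():
        print(f"{multiplier:4.1f} x {FSB_MHZ} MHz = {mhz} MHz")
```

Running it lists fourteen settings from 1000 MHz to 2400 MHz, with 1800 MHz absent.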

Results
For a full listing of the results in data tables, see Appendix V.

All results were recorded, and statistical analysis was performed on them to find the most appropriate model. Included with all results are two statistical model lines, one linear and one exponential. Also calculated was the correlation coefficient (r) of the linear and exponential models against the original data. Appendix L shows the first set of data, Prime95 best times (in ms) at all the tested frequencies.
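As an illustration of this analysis, the following Python sketch fits both models to the Prime95 times listed in Appendix V. It is a sketch under assumptions rather than the tool actually used for the paper: the linear model is an ordinary least-squares line, and the exponential model y = a * b^x is obtained by regressing ln(y) on x, which is how spreadsheets and graphing calculators usually do it. The function and variable names are chosen here for illustration.

```python
import math

# Prime95 best times (ms, 2048K FFT) from Appendix V; 1800 MHz was not testable.
FREQ_MHZ = [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700,
            1900, 2000, 2100, 2200, 2300, 2400]
PRIME95_MS = [305.462, 281.174, 266.985, 244.105, 231.534, 222.025, 213.761,
              201.95, 187.595, 181.573, 178.285, 170.776, 168.605, 163.213]


def linear_fit(xs, ys):
    """Least-squares fit of y = slope*x + intercept; also returns Pearson's r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)


def exponential_fit(xs, ys):
    """Fit y = a * b**x by regressing ln(y) on x; r refers to the log-transformed data."""
    slope, intercept, r = linear_fit(xs, [math.log(y) for y in ys])
    return math.exp(intercept), math.exp(slope), r


if __name__ == "__main__":
    slope, intercept, r_lin = linear_fit(FREQ_MHZ, PRIME95_MS)
    a, b, r_exp = exponential_fit(FREQ_MHZ, PRIME95_MS)
    print(f"linear:      y = {slope:.6f}*x + {intercept:.3f}   |r| = {abs(r_lin):.4f}")
    print(f"exponential: y = {a:.3f} * {b:.7f}^x   |r| = {abs(r_exp):.4f}")
```

Because the times fall as frequency rises, both r values come out negative; the tables in Appendix V report their magnitudes.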

Appendix M shows the results for the Pifast calculations, along with exponential and linear models. Appendix N shows the time taken at the different frequencies to run the Molecular Dynamics model, as well as the corresponding statistical models. Appendix O shows the Primordia calculation time for the varying frequencies. Appendix P shows the time taken to finish the Cipher routine. Appendix Q shows the bandwidth achieved by the Cipher routine. Appendix R shows the Sisoft Million Instructions Per Second (MIPS) benchmark results. This is a very basic means of measuring the raw number of instructions a CPU can handle. Appendix S shows the Million Floating Point Operations Per Second (MFLOPS) achieved at each frequency. Appendices T and U show the results of the multimedia benchmark, in it/s.

Discussion

The hypothesis was that performance would follow a completely linear relationship. As confusing as it may sound, the results were inconclusive. In some of the tests, namely Cipher time, Primordia time, Molecular Dynamics time, and Prime95 time, the exponential model fit better than the linear one, while still not fitting perfectly. The r value tells us which model has the stronger relationship to the original data, and for all of these the r value was closer to one for the exponential model than for the linear one, indicating a strong exponential relationship. For every other benchmark, the linear model fit almost perfectly (r above 0.999), while the exponential model fit noticeably worse.

While the linear results seem logical, and perfectly fit, it is almost inexplicable why, in some tests, an exponential model fit better. One hypothesis might be the way in which the benchmarks were formed. The benchmarks that fit the linear model were more geared towards theoretical performance, while the other benchmarks were real world situations. This is definitely an open door to the future for more research as to why a CPU might behave exponentially in real world tests, while adhering to linear performance in theoretical benchmarking. Further research on the topic could lead CPU manufacturers to more efficient processors that get more done with less effort.
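One check that could help narrow this down, offered as a suggestion rather than as something the experiment tested, is to look at the time-based results as rates. The benchmarks that favored the exponential model all report a time, while those that favored the linear model report a rate (MIPS, MFLOPS, it/s, MB/s), and the Cipher benchmark appears in both groups. The Python sketch below takes the Cipher times from Appendix V and compares how linear the time is in frequency against how linear its reciprocal (work per second) is. The function name is made up for illustration.

```python
import math

# ScienceMark Cipher times (seconds) from Appendix V; 1800 MHz was not testable.
FREQ_MHZ = [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700,
            1900, 2000, 2100, 2200, 2300, 2400]
CIPHER_S = [28.66301, 26.50202, 24.27935, 22.44325, 20.87407, 19.54407, 18.35992,
            17.24314, 15.43873, 14.61881, 13.96487, 13.38392, 12.81679, 12.31024]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Correlation of frequency with elapsed time, and with runs per second (1 / time).
r_time = pearson_r(FREQ_MHZ, CIPHER_S)
r_rate = pearson_r(FREQ_MHZ, [1.0 / t for t in CIPHER_S])

if __name__ == "__main__":
    print(f"frequency vs. time:   r = {r_time:.5f}")
    print(f"frequency vs. 1/time: r = {r_rate:.5f}")
```

If the reciprocal turns out to be far more linear than the time itself, that would suggest the curvature in the time-based charts comes from measuring time instead of rate: a quantity proportional to 1/frequency is a curve, and an exponential model follows a curve more closely than a straight line does. The same conversion could be applied to the Prime95, Pifast, Molecular Dynamics, and Primordia times before drawing any conclusion.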

Unfortunately, as in all experiments, there were multiple possible sources of error in this project. The most obvious is an errant process taking precedence over the benchmarking program, delaying it and thus skewing the results of the benchmark. Windows isn’t necessarily the best platform on which to run these tests, as there are always many processes running in the background beyond the bare operating system, and these may have been a source of error.

Another error may have been temperature. Despite best efforts to minimize the impact of varying temperatures with a watercooling system, the ambient temperature did fluctuate. Microprocessors operate at peak efficiency at cooler temperatures, therefore if it was colder within the room in which one test was conducted, it may yield slightly better performance than another test.

One other source of error may have come from the network. The system was plugged into the network at the time of benchmarking. Often, computers will receive ping requests from other computers, and normal networking duties need to be carried out. When a computer receives a ping request it is mandated to respond, and that response may have taken cycles away from the benchmark, which was running at the time.

If repeated, this experiment would need to be expanded considerably. Perhaps a better approach than using “Free Ram Optimizer XP” to free memory after every benchmark would be to restart the system after every benchmark, giving a similarly fresh system every time. Another factor to consider is the limited scope of the tests available. In an expanded study, a wider variety of tests should be used, and perhaps the research should be conducted in an environment-controlled facility. Another limitation was the CPU itself. Only one microprocessor was available at the time, the AMD Athlon XP. If more research were to be conducted, it may be beneficial to test many different CPU architectures, such as AMD’s 64-bit processors or the Intel P4.

Bibliography

Xavier Gourdon (2004). Algorithm used by PiFast. Retrieved November 20, 2004, from the World Wide Web: http://numbers.computation.free.fr/Constants/PiProgram/pifast.html

Science Mark 2 Team (2004). Benchmark HowTo’s. Retrieved November 21, 2004, from the World Wide Web: http://www.sciencemark.org

Nick Tredennick & Brion Shimamoto (2004). The death of microprocessors. Retrieved November 05, 2004, from the World Wide Web: http://www.embedded.com

Online Author (2004). Moore’s Law. Retrieved December 01, 2004, from the World Wide Web: http://www.intel.com/research/silicon/mooreslaw.htm

Online Author (2004). CPUs. Retrieved November 25, 2004, from the World Wide Web: http://www.wikipedia.org

Gordon Moore (2003). No Exponential is Forever … but We Can Delay ‘Forever’. Retrieved November 28, 2004, from the World Wide Web: ftp://download.intel.com/research/silicon/Gordon_Moore_ISSCC_021003.pdf

APPENDICES

Appendix A

Algorithm used by PiFast
The program implements a Brent binary splitting method together with an efficient cache handling hermitian FFT to multiply big integers (NTT with several primes is used for huge computations). To compute \pi, it is based on the Chudnovsky formula
\[ \frac{426880\sqrt{10005}}{\pi} = \sum_{n \ge 0} \frac{(6n)!\,(545140134n + 13591409)}{(n!)^3\,(3n)!\,(-640320)^{3n}}, \]

which adds roughly 14 decimal digits by term.
PiFast also proposes a second method for verification, based on a Ramanujan formula
\[ \frac{1}{\pi} = 2\sqrt{2} \sum_{n \ge 0} \frac{(4n)!\,(1103 + 26390n)}{4^{4n}\,(n!)^4\,99^{4n+2}}, \]

which adds roughly 8 decimal digits by term.
My experience is that a careful implementation of these techniques seems better than any other approaches (like AGM based formulaes for example) for reachable number of digits.
PiFast implements the computation of E with two formulas, namely
\[ e = \sum_{n=0}^{\infty} \frac{1}{n!} \quad\text{and}\quad e = \left( \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} \right)^{-1}. \]

From version 4.0, PiFast permits to compute a large family of user defined constants from linear combination of general hypergeometric series. The algorithm also uses binary splitting, which generalizes well to hypergeometric series.
(http://numbers.computation.free.fr/Constants/PiProgram/pifast.html)
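To make the Chudnovsky formula quoted above concrete, here is a small Python sketch that sums the series term by term with the standard `decimal` module. It is not PiFast's method: PiFast uses binary splitting and FFT-based multiplication, whereas this direct summation is only practical for a modest number of digits. The function name and the choice of guard digits are assumptions made for illustration.

```python
from decimal import Decimal, localcontext
from math import factorial


def chudnovsky_pi(digits):
    """Approximate pi to roughly `digits` decimal places by direct summation.

    Each term of the series contributes about 14 digits, so digits // 14 + 2
    terms are summed. This is a plain illustration, not PiFast's algorithm.
    """
    with localcontext() as ctx:
        ctx.prec = digits + 10  # a few guard digits against rounding error
        total = Decimal(0)
        for n in range(digits // 14 + 2):
            numerator = factorial(6 * n) * (545140134 * n + 13591409)
            denominator = factorial(n) ** 3 * factorial(3 * n) * (-640320) ** (3 * n)
            total += Decimal(numerator) / Decimal(denominator)
        # The series sums to 426880*sqrt(10005)/pi, so invert to obtain pi.
        return Decimal(426880) * Decimal(10005).sqrt() / total


if __name__ == "__main__":
    print(chudnovsky_pi(50))
```

The factorials grow very quickly, which is why a serious implementation avoids recomputing them for every term and relies on binary splitting instead.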

Appendix B

The .bat file contained these simple parameters:
PiFast, version 4.3 (fix 1) (Copyright 1999-2003 Xavier Gourdon)
http://numbers.computation.free.fr/Constants/PiProgram/pifast.html
Menu :
[0] Compute Pi with Chudnovsky method (Fastest)
[1] Compute Pi with Ramanujan method
[2] Compute E by the exponential series exp(1)
[3] Compute E by the exponential series 1/exp(-1)
[4] Compute Sqrt(2) (useful for testing)
[5] Define your constant with hypergeometric series
[6] Compute a user constant from a .pifast file
[7] Decompress a result file
[8] Check a compress result Pi file
Enter your choice : 0

Choose your computation mode :
[0] standard mode (no disk memory used)
[1] basic disk memory mode (for big computations)
[2] advanced disk memory mode (for huge computations)
Enter your choice : 0

Number of decimal digits : 32M
(33554432 digits)
Possible FFT modes, with approximate needed memory :

FFT Size=4096 k, Mem=230640 K (Fastest mode)
FFT Size=2048 k, Mem=148720 K (Time: Fastest mode * 1.1)
FFT Size=1024 k, Mem=107360 K (Time: Fastest mode * 1.3)
FFT Size= 512 k, Mem=86880 K (Time: Fastest mode * 1.7)
FFT Size= 256 k, Mem=76440 K (Time: Fastest mode * 2.7)
FFT Size= 128 k, Mem=71320 K (Time: Fastest mode * 4.7)
...

Enter FFT Size in k :4096

Compressed output (also useful to specify output format) ? [0=No, 1=Yes] : 0

Basically, the .bat file just specified the fastest way to compute pi, told it to use no disk memory (as that would be another bottleneck and possible variable), compute 33554432 digits, and to use 230,640 K of memory.

Appendix C

Screenshot of Pifast’s completed computation with CPU information provided by WCPUID on top.

Appendix D

Screenshot of Sisoft Sandra Arithmetic Benchmark.

Appendix E

Screenshot of Free Ram Optimizer XP.

Appendix F

A picture of Prime95 benchmarking.

Appendix G

A screenshot of the Molecular Dynamics program in action.

Appendix H

A shot of the Primordia simulation.

Appendix I

Screenshot of Cipher in action.

Appendix K
Variable issues:
Temperature
Program Error
Network computation

Procedure:

Boot At specified frequency.

Use "Free Ram Optimizer XP" Free up 300.00 MB of memory before benchmark load

WCPUID ON SIDE OF SCREEN!

---
Open Sisoft Sandra
Run Sisoft Sandra CPU Arithmetic Benchmark

SCREENSHOT! File Name: SSSCPUABENCH$frequency$$DATE$.PNG

run Sisoft Sandra CPU Multi-Media Benchmark

SCREENSHOT! File Name: SSSCPUMMBENCH$frequency$$DATE$.PNG
+++++++++MOVE FILES TO F:\Benchmarks WITH APPROPRIATE AREA SAVED+++++++++

---

Use "Free Ram Optimizer XP" Free up 300.00 MB of memory before benchmark load

---

run f:\program files\pifast\PIFAST32M.bat

SCREENSHOT! COMPUTATION OK... File Name: PIFAST$frequency$$DATE$.PNG
CHANGE LOG FILE NAME! SAVE AS: PIFAST$frequency$$DATE$.TXT
+++++++++MOVE FILES TO F:\Benchmarks WITH APPROPRIATE AREA SAVED+++++++++

---

Use "Free Ram Optimizer XP" Free up 300.00 MB of memory before benchmark load

---

Run CPU Right Mark
Open prime95
Benchmark, make sure log is there

+++++++++MOVE FILES TO F:\Benchmarks WITH APPROPRIATE AREA SAVED+++++++++

---

Open Science Mark

Run Molecular Dynamics Simulation: Default settings, 140 K Temperature

Run primordia Simulation: Aluminum

Run Cipher Benchmark

CHANGE LOG FILE NAME! SAVE AS: SM$frequency$$date$.txt
CONFIRM THAT ALL RESULTS ARE CONTAINED.

+++++++++MOVE FILES TO F:\Benchmarks WITH APPROPRIATE AREA SAVED+++++++++


Appendix V					
		Prime 95 Best Time (ms) 2048 K FFT	Linear Model	Exponential Model	
Frequency (in MHz)	1000	305.462	280.8938541	284.8999874	
	1100	281.174	271.4561845	272.8792246	
	1200	266.985	262.0185148	261.365653	
	1300	244.105	252.5808452	250.3378726	
	1400	231.534	243.1431755	239.7753864	
	1500	222.025	233.7055059	229.6585624	
	1600	213.761	224.2678362	219.9685967	
	1700	201.95	214.8301666	210.687479	
	1800		205.3924969	201.7979588	
	1900	187.595	195.9548273	193.2835134	
	2000	181.573	186.5171576	185.1283173	
	2100	178.285	177.079488	177.3172128	
	2200	170.776	167.6418183	169.8356816	
	2300	168.605	158.2041487	162.669818	
	2400	163.213	148.766479	155.8063032	
Y= Equation			y=-0.0943766965*x+375.2705506	y=438.4418548*0.9995690039^x	
Rvalue			0.966275416	0.983087601	Best Model: Exponential
					
		Pifast Calculation Time For 32M (33554432) digits of Pi (seconds)	Linear Model	Exponential Model	
Frequency (in MHz)	1000	420.78	400.728694	406.1926074	
	1100	418.31	388.2474392	390.4816981	
	1200	368.28	375.7661844	375.3784629	
	1300	346.31	363.2849296	360.8593977	
	1400	330.61	350.8036748	346.9019077	
	1500	351.5	338.32242	333.4842721	
	1600	303.55	325.8411652	320.5856102	
	1700	292.78	313.3599104	308.1858488	
	1800		300.8786556	296.2656912	
	1900	274.41	288.3974008	284.8065871	
	2000	295.5	275.916146	273.7907036	
	2100	263.39	263.4348912	263.2008976	
	2200	248.16	250.9536364	253.0206892	
	2300	246.55	238.4723816	243.2342357	
	2400	239.39	225.9911268	233.8263072	
Y= Equation			y=-0.124812548*x+525.541242	y=602.6224854*0.9996056143^x	
Rvalue			0.957089526	0.968121161	Best Model: Exponential

Molecular Dynamics		Calculation Time (seconds)	Linear Model	Exponential Model	
Frequency (in MHz)	1000	200.4713	180.5514806	188.1055998	
	1100	184.01943	171.9272875	175.133709	
	1200	162.73004	163.3030944	163.0563688	
	1300	149.60813	154.6789013	151.8118902	
	1400	136.93067	146.0547082	141.3428385	
	1500	129.36831	137.4305152	131.5957397	
	1600	117.42824	128.8063221	122.5208076	
	1700	108.35438	120.182129	114.0716889	
	1800		111.5579359	106.2052273	
	1900	95.24033	102.9337428	98.88124225	
	2000	91.79168	94.3095497	92.06232421	
	2100	84.85377	85.68535661	85.71364341	
	2200	80.69694	77.06116352	79.80277198	
	2300	76.39628	68.43697043	74.29951829	
	2400	73.28478	59.81277734	69.17577273	
Y= Equation			y=-.0862419309*x+266.7934115	y=384.3452462*.9992857175^x	
Rvalue			0.969827086	0.992891783	Best Model: Exponential
					
Primordia (aluminum)		Calculation Time (seconds)	Linear Model	Exponential Model	
Frequency (in MHz)	1000	28.9436	26.69250317	27.33774747	
	1100	26.88128	25.64557487	25.91462602	
	1200	24.71778	24.59864657	24.56558802	
	1300	23.06108	23.55171827	23.28677691	
	1400	21.43038	22.50478997	22.07453688	
	1500	20.37451	21.45786167	20.92540245	
	1600	19.24684	20.41093337	19.83608853	
	1700	18.07063	19.36400507	18.80348103	
	1800		18.31707677	17.82462799	
	1900	16.36488	17.27014847	16.89673111	
	2000	15.69712	16.22322017	16.01713778	
	2100	15.17129	15.17629187	15.18333345	
	2200	14.6119	14.12936357	14.39293447	
	2300	14.00373	13.08243527	13.64368131	
	2400	13.5671	12.03550697	12.93343203	
Y= Equation			y=-.010469283*x+37.16178617	y=46.65954972*.9994655337^x	
Rvalue			0.973713895	0.991162496	Best Model: Exponential
					
ScienceMark Cipher (Sec)		Cipher Time (seconds)	Linear Model	Exponential Model	
Frequency (in MHz)	1000	28.66301	26.34561101	27.16241494	
	1100	26.50202	25.22812051	25.58257982	
	1200	24.27935	24.11063001	24.09463193	
	1300	22.44325	22.99313951	22.69322687	
	1400	20.87407	21.87564901	21.37333109	
	1500	19.54407	20.75815851	20.13020381	
	1600	18.35992	19.64066801	18.95937997	
	1700	17.24314	18.52317751	17.85665422	
	1800		17.40568701	16.81806581	
	1900	15.43873	16.28819651	15.83988433	
	2000	14.61881	15.17070601	14.91859638	
	2100	13.96487	14.05321551	14.05089288	
	2200	13.38392	12.93572501	13.23365723	
	2300	12.81679	11.81823451	12.46395407	
	2400	12.31024	10.70074401	11.73901881	
Y= Equation			y=-.011174905*x+37.52051601	y=49.45483921*.9994009538^x	
Rvalue			0.975252583	0.993499578	Best Model: Exponential

ScienceMark Cipher (MB)		Cipher Bandwidth (MB/s)	Linear Model	Exponential Model
Frequency (in MHz)	1000	53.29	52.7499795	56.19132654
	1100	57.58	57.8579206	59.65974306
	1200	62.85	62.9658617	63.34224801
	1300	67.99	68.0738028	67.25205603
	1400	73.1	73.1817439	71.40319743
	1500	78.07	78.289685	75.81056854
	1600	83.11	83.3976261	80.48998518
	1700	88.49	88.5055672	85.45823937
	1800		93.6135083	90.73315967
	1900	98.83	98.7214494	96.33367507
	2000	104.38	103.8293905	102.279883
	2100	109.27	108.9373316	108.5931213
	2200	114.01	114.0452727	115.296045
	2300	119.05	119.1532138	122.4127075
	2400	123.95	124.2611549	129.9686469
Y= Equation			y=.051079411*x+1.670568502	y=30.87083292*1.00059913^x	
Rvalue			0.999927074	0.993926226	Best Model: Linear

Sisoft Sandra Arithmetic Benchmark (MIPS)		Arithmetic Benchmark (MIPS)	Linear Model	Exponential Model	
Frequency (in MHz)	1000	4159	4112.970295	4405.020146	
	1100	4499	4531.984891	4685.756818	
	1200	4991	4950.999488	4984.385141	
	1300	5357	5370.014085	5302.045368	
	1400	5812	5789.028682	5639.950423	
	1500	6166	6208.043278	5999.39053	
	1600	6586	6627.057875	6381.738143	
	1700	7043	7046.072472	6788.45318	
	1800		7465.087068	7221.088604	
	1900	7894	7884.101665	7681.296348	
	2000	8294	8303.116262	8170.833628	
	2100	8653	8722.130858	8691.569645	
	2200	9197	9141.145455	9245.492728	
	2300	9589	9560.160052	9834.717925	
	2400	9986	9979.174649	10461.49508	
Y= Equation			y=4.190145967*x+-77.17567222	y=2374.813069*1.000618017^x	
Rvalue			0.99981704	0.994394962	Best Model: Linear

Sisoft Sandra Arithmetic Benchmark (MFLOPS)		Arithmetic Benchmark (MFLOPS)	Linear Model	Exponential Model	
Frequency (in MHz)	1000	1601	1594.966197	1706.653692	
	1100	1758	1757.569014	1815.690011	
	1200	1916	1920.171831	1931.692546	
	1300	2076	2082.774648	2055.106362	
	1400	2249	2245.377465	2186.404958	
	1500	2393	2407.980282	2326.092083	
	1600	2554	2570.583099	2474.703673	
	1700	2782	2733.185915	2632.809902	
	1800		2895.788732	2801.017372	
	1900	3049	3058.391549	2979.971442	
	2000	3211	3220.994366	3170.358699	
	2100	3381	3383.597183	3372.9096	
	2200	3557	3546.2	3588.401266	
	2300	3706	3708.802817	3817.660469	
	2400	3869	3871.405634	4061.566802	
Y= Equation			y=1.626028169*x+-31.06197183	y=918.7183851*1.000619502^x	
Rvalue			0.99977496	0.993364412	Best Model: Linear

Sisoft Sandra MultiMedia Benchmark Integer x4 aEMMX/aSSE (it/s)		Integer x4 aEMMX/aSSE (it/s)	Linear Model	Exponential Model	
Frequency (in MHz)	1000	9262	9232.395391	9891.067441	
	1100	10203	10182.13214	10526.27378	
	1200	11156	11131.86889	11202.2732	
	1300	12038	12081.60563	11921.68544	
	1400	13053	13031.34238	12687.29848	
	1500	13944	13981.07913	13502.07933	
	1600	14837	14930.81588	14369.18559	
	1700	15889	15880.55263	15291.97757	
	1800		16830.28937	16274.03145	
	1900	17815	17780.02612	17319.15302	
	2000	18739	18729.76287	18431.39251	
	2100	19710	19679.49962	19615.06025	
	2200	20648	20629.23636	20874.74336	
	2300	21559	21578.97311	22215.32358	
	2400	22527	22528.70986	23641.99612	
Y= Equation			y=9.497367478*x+-264.9720871	y=5307.966561*1.000622617^x	
Rvalue			0.999965216	0.993795729	Best Model: Linear

Sisoft Sandra MultiMedia Benchmark Floating-Point x4 aSSE (it/s)		Floating-Point x4 aSSE (it/s)	Linear Model	Exponential Model	
Frequency (in MHz)	1000	9922	9897.197443	10596.04679	
	1100	10914	10909.54008	11274.17985	
	1200	11952	11921.88272	11995.71254	
	1300	12894	12934.22536	12763.42236	
	1400	13990	13946.56799	13580.26461	
	1500	14900	14958.91063	14449.38368	
	1600	15871	15971.25327	15374.12523	
	1700	17031	16983.59591	16358.04903	
	1800		17995.93855	17404.94265	
	1900	19053	19008.28118	18518.83609	
	2000	20004	20020.62382	19704.01724	
	2100	21056	21032.96646	20965.04843	
	2200	22055	22045.3091	22306.78396	
	2300	23054	23057.65174	23734.3888	
	2400	24062	24069.99437	25253.35846	
Y= Equation			y=10.12342638*x+-226.2289373	y=5698.137911*1.000620534^x	
Rvalue			0.999958386	0.993742568	Best Model: Linear

Excelsior · Mar 29, 2005

Rest of the appendices.

Excelsior · Mar 29, 2005

Okay, anyhow. That was my computer science project. Let me know what you guys think, and I REALLY need some suggestions/explanations as to why the more synthetic benchmarks were linear, but others were exponential, or so it seemed. I've got to present this on Sunday, so I'd much appreciate any help or criticism I could get. Oh, as always, feel free to quote, redistribute, or whatever to my paper, as long as you give me, Rami Saikali, credit.

Thanks,

-Excelsior

SavageBasher · Mar 29, 2005

All I can say is..... Wow.

Excelsior · Mar 29, 2005

It's not that long. Main paper is only 8 pages double spaced, four single spaced. Also, the last line in the main paper about P4 and a64 being different ARCHITECTURE... just ignore it, I know they're both x86, just I needed a filler word and I couldn't find it at the time. It'll be changed most likely.

tenchi86 · Mar 29, 2005

Holy crap man, that's nice. I am way too lazy to read all that. Also, I am sure you covered this in there, but from what I have read some CPUs like the Prescott don't perform as well at low speeds; at higher speeds they perform better, though. That's due to things like cache and FSB, so it's not just the clock frequency. Anyway, nice work on the paper, and good luck getting someone to read all that.

XWRed1 · Mar 29, 2005

I REALLY need some suggestions/explanations as to why the more synthetic benchmarks were linear, but others were exponential, or so it seemed.

I didn't read all of your stuff, but it seems like usually this is due to the CPU cache size and the data set the benchmark uses.

IMHO performance is proportional to frequency. And it's also proportional to efficiency (efficiency being measured by the amount of work done per clock cycle). Too many people want it to be ONLY one or the other, and then declare the side whose engineering teams favored the other to be wrong.

If Performance P = (Frequency F)(Efficiency E), then you are going to make up for a low F or a low E by boosting the other one. Surprisingly, this is exactly what Intel and AMD do. Intel went with a high-F/low-E design, and AMD went with a high-E/low-F design. They both yield about the same performance.

Like I said, I haven't read your paper, but if your paper tries to tackle this subject I hope it's coming out to the same answer.

I.M.O.G. · Mar 29, 2005

Please attach a doc version in a zip to your post, and I will take a look over the entire paper, then offer any advice I might have.

squasher · Mar 29, 2005

STICKY!!
I don't have time to read it right now...but damn!

tenchi86 · Mar 29, 2005

It's pretty big, maybe too big for a thumbtack to support. Just messing, yeah, this may be a cool sticky.

Excelsior · Mar 30, 2005

I.M.O.G. said:
Please attach a doc version in a zip to your post, and I will take a look over the entire paper, then offer any advice I might have.

<3 you and anyone else willing to take even a few minutse to try to help

I'll upload and attach the .doc files, as well as one .xls with the graphs/data tables in a second hopefully.

Essentially, I knew there are quite a few relationships in computing, like Moore's law, which is exponential. So I wanted to know if the relationship between frequency in MHz and performance, as measured by benchmarks, was linear or exponential. And in a lot of cases I got mixed results. That's kinda where I left off. Some were REALLY linear, almost to a tee. And some had a VERY strong leaning towards exponential. This intrigued me, as I thought it'd be strictly linear.
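For readers who want to try the linear-versus-exponential comparison themselves, here is a minimal sketch. It assumes the exponential model is judged the way most spreadsheets and calculators do it, by correlating frequency against the natural log of the score. The function names and the data are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_models(freqs_mhz, scores):
    """Return (r for a linear model, r for an exponential model).

    The exponential r is computed on (frequency, ln(score)), which is an
    assumption about how the fit is done, not the paper's stated method.
    """
    r_linear = pearson_r(freqs_mhz, scores)
    r_exponential = pearson_r(freqs_mhz, [math.log(s) for s in scores])
    return r_linear, r_exponential


# Made-up data: one perfectly linear benchmark, one perfectly exponential.
freqs = [1000, 1500, 2000, 2400]
linear_scores = [2.0 * f for f in freqs]
exponential_scores = [100 * math.exp(0.001 * f) for f in freqs]

print(compare_models(freqs, linear_scores))
print(compare_models(freqs, exponential_scores))
```

Whichever r is closer to 1 (or -1 for timed benchmarks where lower is better) indicates the better-fitting model for that benchmark.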

zexmarquies01 · Mar 30, 2005

holy crap dude.

I just read all of that, and man, it's nuts. But it's GREAT. This should not only prove a very interesting thread, but I also feel that this is sticky material. Not sure which area exactly, but actual data on overclocking, and the difference in performance compared to speed, is great data, man.

Honestly, I'm not sure why some of the tests are different. As said before, on-die cache, and how and how much the program uses it, can cause a huge difference in benchies.

My friend has a laptop. His CPU's frequency is lower than mine, but he has 2 MB of L2 cache. I, on the other hand, have 512 KB of cache. He can do a WU on BOINC faster than I can.

Man, I really hope this project ends up with some answers, because this alone is probably the best question I have ever seen on these boards that had to do with overclocking, and it causes everyone to think.

Sorry I'm no real help on this, but I will definitely keep updated on this thread.

Excelsior · Mar 31, 2005

Okay, all done archiving (.rar) I can put it in .zip as well if you want...

Anyhow, the files in it are as follows:

1. Main Paper
2. Appendix A-K.doc
3. Appendix L-T (graphs).doc
4. Raw Data.xls

Then the rest is extra, as well as the fact that I included the working directory with ALL pifast logs in their entirety, as well as TONS of screenshots that I never incorporated into the paper due to length issues.

http://excelsior.zerobrains.com/Computer Science.rar

There she is. It should be finished uploading in approximately three minutes, then you can view it.

Excelsior · Mar 31, 2005

Just kind of a commentary on usefulness. While it might just seem kinda fun at first, it might actually be useful for predicting performance on a particular application. If it were possible to run further tests, one could make a database of functions to predict the performance of a particular CPU, paired with a particular FSB and a particular size of memory. Right now, via these graphs, I could tell you the performance of that chip at frequencies in excess of 4000 MHz, if it were possible for them to be reached, within 5%. That's the power of extrapolation and being able to read between the lines/interpret data. Suppose it were possible to make one large function, different for each benchmark: say, the function for the time it takes to finish Prime95 at a 2048K FFT. The factors that'd go in at first are just 512 MB of RAM, FSB, and frequency. However, if we were able to gather more information, we could find out how less RAM and more FSB affected the performance, and make a function to compensate for this, letting us predict performance for any frequency with the same CPU, any memory size, and any FSB. Obviously this is within reason: there will be some error due to bottlenecks in different chipsets, etcetera. However, usually these differences are negligible (<5%), and it might be very useful/accurate to develop such methods.

It also somewhat shows the power of overclocking. It's nice to see performance kind of laid out right in front of your face. The math behind it is definitely fun as well. If you're doing one of the benchmarks that followed a linear path, technically all you'd have to do is boot up at one frequency, say 1000 MHz, then again at another, say 1500 MHz, and I could make a graph that would tell you the performance of said benchmark at 2400 MHz, or beyond, within a reasonable error.

I just find it all kind of fun. I'd really like some critique though.
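The two-point idea described here can be sketched as follows. It only makes sense for benchmarks that really do scale linearly, and the scores used are hypothetical placeholders, not results from the paper.

```python
def line_through(point_a, point_b):
    """Return (slope, intercept) of the line through two (MHz, score) points."""
    (freq_a, score_a), (freq_b, score_b) = point_a, point_b
    slope = (score_b - score_a) / (freq_b - freq_a)
    intercept = score_a - slope * freq_a
    return slope, intercept


def predict(slope, intercept, freq_mhz):
    """Extrapolate the benchmark score at a frequency that was not measured."""
    return slope * freq_mhz + intercept


# Hypothetical scores measured at 1000 MHz and 1500 MHz.
slope, intercept = line_through((1000, 500.0), (1500, 750.0))
print(predict(slope, intercept, 2400))
```

With more than two measured points, a least-squares fit would be the more robust choice, since a single noisy run would otherwise skew the whole line.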

Remove · Mar 31, 2005

While your methods and procedures are sound, I think you need to work more on getting the "control" portion of the testing into smaller parameters (no networking, level temperatures for the CPU through whatever means necessary to achieve, etc.) so that your results have less of a chance to be berated in any way possible. Having an absolute in each test would be paramount in a higher level thesis. (end of constructive criticism)

Other than that I give major kudos to you and look forward to seeing more out of you in the future. I am forwarding this to my son to read at his school.

Quailane · Mar 31, 2005

Pretty good. I have one question though. For one test that you did, tell me at what CPU speed would doubling that speed give a statistically insignificant boost in performance? That could be the absolute performance limit of the other hardware. To better investigate, you could run an FX-55 at 3 GHz on a single 128 MB stick of DDR200 (PC1600) at 3-4-4-8 timings.

I like your project, but I don't think you let something limit the cpu performance like you should. If your ram and motherboard were good enough, you would not have experienced any non-linear performance changes. For all the test benchmarks you used, you would have gotten all non-linear performance changes if your data had a broad enough range.
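One hedged way to picture the limit being asked about is a toy model in which part of a benchmark's run time scales with clock speed and part (memory, disk, and so on) does not. This is an assumed model for illustration only; the function names and numbers are hypothetical and nothing here was measured in the paper.

```python
def run_time(freq_mhz, cpu_megacycles, fixed_seconds):
    """Seconds to finish: a CPU-bound part that shrinks with clock speed
    plus a fixed part that the CPU clock cannot speed up."""
    return cpu_megacycles / freq_mhz + fixed_seconds


def doubling_speedup(freq_mhz, cpu_megacycles, fixed_seconds):
    """How many times faster the run gets when the clock is doubled."""
    before = run_time(freq_mhz, cpu_megacycles, fixed_seconds)
    after = run_time(2 * freq_mhz, cpu_megacycles, fixed_seconds)
    return before / after


# Hypothetical workload: 20,000 megacycles of CPU work, 10 s of fixed waiting.
for freq in (1000, 2000, 4000, 8000):
    print(freq, round(doubling_speedup(freq, 20_000, 10.0), 3))
```

In this model the benefit of doubling the clock falls toward 1.0x as the fixed part comes to dominate, and it stays at exactly 2.0x only when the fixed part is zero, which matches the claim that good enough RAM and motherboard would keep the scaling linear.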

Excelsior · Mar 31, 2005

Remove said:
While your methods and procedures are sound, I think you need to work more on getting the "control" portion of the testing into smaller parameters (no networking, level temperatures for the CPU through whatever means necessary to achieve, etc.) so that your results have less of a chance to be berated in any way possible. Having an absolute in each test would be paramount in a higher level thesis.

Other than that I give major kudos to you and look forward to seeing more out of you in the future. I am forwarding this to my son to read at his school.

Aye, I know. Experimental design was lacking. I identified literally dozens of variables. However, most of my results were very consistent and didn't show signs of variation. Pifast, however, did incur the most variation, most likely due to memory issues. If I am able to repeat this, I most definitely won't be running more than one benchmark on each bootup. The machine will be isolated, and while I did use watercooling to attempt to normalize the temperature, it will also be in a more temperature-controlled environment. Pretty much, it'd be a much longer procedure: boot at a frequency, run a single benchmark, reboot. Repeat for all benchmarks, then all frequencies. That'd be ideal, as well as including different CPUs like AMD64 and Intel P4, and possibly dual processors.

Excelsior · Mar 31, 2005

Quailane said:
Pretty good. I have one question though. For one test that you did, tell me at what CPU speed would doubling that speed give a statistically insignificant boost in performance? That could be the absolute performance limit of the other hardware. To better investigate, you could run an FX-55 at 3 GHz on a single 128 MB stick of DDR200 (PC1600) at 3-4-4-8 timings. I like your project, but I don't think you let something limit the cpu performance like you should. If your ram and motherboard were good enough, you would not have experienced any non-linear performance changes. For all the test benchmarks you used, you would have gotten all non-linear performance changes if your data had a broad enough range.

Sure, and this is one of the things the test wanted to check out: which benchmarks gain NO performance increase past a certain point, when do other parts of your system become a big hindrance to performance, and when does adding more MHz just become not beneficial in the long run. The interesting thing I wanted more elaboration on was that synthetic benchmarks remained linear, while some real-world(ish) benchmarks showed signs of exponential regression. This would suggest that while it might help the numbers to throw tons more frequency out there, it might not continue helping performance on a linear front.

XWRed1 · Mar 31, 2005

and when does adding more MHZ just become not-beneficial in the long run.

All things being equal, shouldn't the answer be "never"?

It's just another factor of performance. It should never hurt to increase it, but real-world issues like heat give you a short-term limit. But assuming heat isn't a problem going to a certain clock speed, then it is always going to be a good thing, because IPC and the bandwidth of the various components of the system won't suffer for it.

This would suggest that while it might help the numbers to throw tons more frequency out there, it might not continue helping performance on a linear front.

I think you can only extrapolate this if you can divine an equation expressing the performance of the cpu. The fact that it is sitting in a system that has a hard drive and ram and things like that in it, and that you are running benchmarks that might get bottlenecked by those things, doesn't help.

I.M.O.G. · Mar 31, 2005

Excelsior said:
Okay, all done archiving (.rar) I can put it in .zip as well if you want...

Anyhow, the files in it are as follows:

1. Main Paper
2. Appendix A-K.doc
3. Appendix L-T (graphs).doc
4. Raw Data.xls

Then the rest is extra, as well as the fact that I included the working directory with ALL pifast logs in their entirety, as well as TONS of screenshots that I never incorporated into the paper due to length issues.

http://excelsior.zerobrains.com/Computer Science.rar

There she is. It should be finished uploading in approximately three minutes, then you can view it.

I will take a look tonight if I get a chance and let you know what I think.
