What is IPC, and how to compare cycles or Hz across different CPU architectures
A CPU has many functional units for executing instructions (compiled program code) and performing computations: integer unit(s), floating point unit(s), an instruction decode unit, a control unit, instruction schedulers, register files, cache, and so on. More and more functions are integrated into a CPU as transistors shrink with each new generation of silicon technology. As a result, multiple instructions can be executed during a single CPU cycle.
IPC stands for instructions per cycle: the number of integer or floating point instructions executed per clock cycle in a CPU. In CPU arithmetic benchmarking, a set of defined programs (CPU/cache intensive, with minimal memory access) is executed to measure the average instructions per cycle.
Based on the Sandra CPU arithmetic benchmark reference CPUs (these numbers may vary between Sandra versions, so don't take them as absolute), with each IPC computed as the benchmark score divided by the CPU clock in MHz:
XP Dhrystone integer IPC = 7829/2080 = 3.764
XP Whetstone floating point IPC = 3180/2080 = 1.529
Comparing with a P4B,
P4B Dhrystone integer IPC = 8164/3060 = 2.668
P4B Whetstone floating point IPC = 1717/3060 = 0.561 (w/o SSE2)
P4B Whetstone floating point IPC = 4009/3060 = 1.310 (w/ SSE2)
Ratio of XP to P4B (w/o SMT):
Dhrystone integer IPC = 3.764/2.668 = 1.41:1
Whetstone floating point IPC = 1.529/0.561 = 2.73:1 (w/o SSE2), 1.529/1.310 = 1.17:1 (w/ SSE2)
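The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the helper name `ipc` is made up for illustration, and it assumes the Sandra scores are in MIPS (Dhrystone) or MFLOPS (Whetstone) and the clock in MHz, so that score divided by clock gives the per-cycle figure used in this post.

```python
# Illustrative sketch only: `ipc` is a hypothetical helper, not part of Sandra.
# Assumption: scores are in MIPS / MFLOPS and the clock is in MHz, so
# score / clock gives the per-cycle figure used in the post.

def ipc(score, clock_mhz):
    """Return the benchmark score divided by the clock speed in MHz."""
    return score / clock_mhz

# Reference numbers quoted in the post (XP at 2080 MHz, P4B at 3060 MHz).
xp_int = ipc(7829, 2080)
xp_fp = ipc(3180, 2080)
p4b_int = ipc(8164, 3060)
p4b_fp = ipc(1717, 3060)       # w/o SSE2
p4b_fp_sse2 = ipc(4009, 3060)  # w/ SSE2

print(f"XP  Dhrystone integer IPC      = {xp_int:.3f}")
print(f"XP  Whetstone floating pt IPC  = {xp_fp:.3f}")
print(f"P4B Dhrystone integer IPC      = {p4b_int:.3f}")
print(f"P4B Whetstone IPC (w/o SSE2)   = {p4b_fp:.3f}")
print(f"P4B Whetstone IPC (w/ SSE2)    = {p4b_fp_sse2:.3f}")

# XP : P4B ratios, computed from the unrounded IPC values.
print(f"XP : P4B integer               = {xp_int / p4b_int:.2f}:1")
print(f"XP : P4B floating pt, no SSE2  = {xp_fp / p4b_fp:.2f}:1")
print(f"XP : P4B floating pt, SSE2     = {xp_fp / p4b_fp_sse2:.2f}:1")
```

Because the script divides the unrounded IPC values, a ratio can differ from the hand-computed one by 0.01 in the last digit (the hand calculation rounds each IPC to three decimals first).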
Comparing with a P4C w/ 2 SMT,
P4C Dhrystone integer IPC = 9858/3200 = 3.081
P4C Whetstone floating point IPC = 4062/3200 = 1.269 (w/o SSE2)
P4C Whetstone floating point IPC = 7139/3200 = 2.231 (w/ SSE2)
Ratio of XP to P4C (w/ 2 SMT):
Dhrystone integer IPC = 3.764/3.081 = 1.22:1
Whetstone floating point IPC = 1.529/1.269 = 1.20:1 (w/o SSE2), 1.529/2.231 = 0.69:1 (w/ SSE2)
Using another version of Sandra (see next post) to compare with a P4C (without SMT):
XP Dhrystone integer IPC = 8404/2200 = 3.82
XP Whetstone floating point IPC = 3465/2200 = 1.575
P4C Dhrystone integer IPC = 7869/3200 = 2.459
P4C Whetstone floating point IPC = 2365/3200 = 0.739 (w/o SSE2)
P4C Whetstone floating point IPC = 4325/3200 = 1.352 (w/ SSE2)
Ratio of XP to P4C:
Dhrystone integer IPC = 1.55:1
Whetstone floating point IPC = 2.13:1 (w/o SSE2), 1.16:1 (w/ SSE2)
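These ratios are quoted without showing the division, so here is a quick check in Python. It is a sketch under the same assumption as the rest of the post (score divided by clock in MHz gives the per-cycle figure); the dictionary layout and the `per_cycle` helper are made up for illustration.

```python
# Hypothetical check of the XP vs P4C (no SMT) ratios; names are illustrative.
# Scores and clocks (MHz) are the ones quoted in the post for this Sandra version.
scores = {
    "XP": {"clock": 2200, "dhrystone": 8404, "whetstone": 3465},
    "P4C": {
        "clock": 3200,
        "dhrystone": 7869,
        "whetstone": 2365,       # w/o SSE2
        "whetstone_sse2": 4325,  # w/ SSE2
    },
}

def per_cycle(cpu, bench):
    """Benchmark score divided by clock in MHz for the given CPU."""
    return scores[cpu][bench] / scores[cpu]["clock"]

ratios = {
    "integer": per_cycle("XP", "dhrystone") / per_cycle("P4C", "dhrystone"),
    "floating point (w/o SSE2)":
        per_cycle("XP", "whetstone") / per_cycle("P4C", "whetstone"),
    "floating point (w/ SSE2)":
        per_cycle("XP", "whetstone") / per_cycle("P4C", "whetstone_sse2"),
}

for name, ratio in ratios.items():
    print(f"XP : P4C {name} = {ratio:.2f}:1")
```

As before, the script works from unrounded values, so the last digit of a ratio may be off by 0.01 from the figure obtained by rounding each IPC first.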
That is, for executing the code specified in each of the benchmarks,
e.g. comparing an XP with a P4C w/ 2 SMT:
- For Dhrystone integer arithmetic, 100 AMD XP cycles do the same computation as about 122 Intel P4C cycles.
- For Whetstone floating point arithmetic, 100 AMD XP cycles do the same computation as about 120 Intel P4C cycles (2 SMT, w/o SSE2), or about 69 P4C cycles (2 SMT + SSE2).
In summary,
- 1 AMD Hz = 1.22 P4C Hz (2 SMT) for integer arithmetic (based on Dhrystone benchmark)
- 1 AMD Hz = 1.20 P4C Hz (2 SMT) for floating point arithmetic (based on Whetstone benchmark)
- 1 AMD Hz = 0.69 P4C Hz (2 SMT) for SSE2 floating point arithmetic (based on Whetstone benchmark)
Example,
- An XP/Barton running at 2.5 GHz is about as fast as a P4C running at about 3.05 GHz (= 2.5 x 1.22) for integer computation, in terms of raw CPU power.
- A 2.8 GHz Barton would perform about the same as a 3.4 GHz P4C (= 2.8 x 1.22) in integer arithmetic.
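The clock conversion in these examples is a single multiplication, sketched below in Python. The function name `equivalent_p4c_ghz` is made up for illustration, and the default factor of 1.22 is the Dhrystone integer ratio for XP vs P4C (2 SMT) from the summary; it only applies to that benchmark.

```python
# Illustrative only: `equivalent_p4c_ghz` is a made-up helper name.
# The default factor 1.22 is the XP : P4C (2 SMT) Dhrystone integer ratio
# from the summary; other workloads need their own ratio.

def equivalent_p4c_ghz(xp_ghz, ratio=1.22):
    """Return the P4C clock (GHz) doing the same work per second as an XP at xp_ghz."""
    return xp_ghz * ratio

for xp_ghz in (2.5, 2.8):
    print(f"XP/Barton at {xp_ghz:.1f} GHz ~ P4C at {equivalent_p4c_ghz(xp_ghz):.2f} GHz (integer)")
```

Passing a different `ratio` (for example the SSE2 floating point figure of 0.69) gives the equivalent clock for that benchmark instead.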
The benchmark code is usually CPU/cache intensive, to test the CPU itself, and requires little or no memory access.
The IPC numbers vary with CPU architecture, so the XP/Barton will differ from the A64 and other CPUs, and each can be measured accordingly.
How to interpret Sandra CPU benchmark, IPC and comparing with P4 (page 2)
What is cycle time and frequency
Frequency, clock, period of synchronous operations, latency
Analogy on Bus Speed, Bandwidth and Latency
This link shows the screen shots for the Sandra run used for the IPC calculation in the last post.
Overclocking a mobile Barton 2400+ to 2.6/2.7+ GHz on air (page 18)