Can I get some volunteers to run a custom Prime95 benchmark on Kaby Lake? I want to check that Kaby Lake (KL) has the same IPC as Skylake in this workload, as there is still one claim to the contrary. That claim comes with no credibility or evidence, so I don't believe it, but I would like hard data to know for sure.
Use version 28.10 http://www.mersenne.org/download/
Add the following lines to the top of prime.txt
MinBenchFFT=128
MaxBenchFFT=128
If prime.txt doesn't exist, it gets created the first time you run Prime95. Make sure to exit Prime95 completely before editing the file, otherwise it will overwrite your changes on close.
The lines fix the benchmark to a single FFT size that is small enough to run entirely from CPU cache, so RAM speed doesn't matter. Results should scale linearly with CPU clock. I'm just looking to get some sample points on peak IPC here. One run should be OK, but I would recommend running it a few times and picking the best. Higher throughput is better. It is pretty quick.
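If you would rather script the edit, here is a minimal sketch. It only assumes that prime.txt is a plain text file of key=value lines; the helper name `prepend_bench_settings` is made up for illustration and is not part of Prime95. Run it only after Prime95 has been closed.

```python
from pathlib import Path

# The two settings from this post.
SETTINGS = ["MinBenchFFT=128", "MaxBenchFFT=128"]

def prepend_bench_settings(path):
    """Put the benchmark settings at the top of an existing prime.txt.

    Hypothetical helper. Raises FileNotFoundError if the file does not
    exist yet (run Prime95 once first so it gets created).
    """
    p = Path(path)
    existing = p.read_text().splitlines()
    # Drop earlier copies of these keys so they are not duplicated.
    keys = tuple(s.split("=")[0] + "=" for s in SETTINGS)
    kept = [ln for ln in existing if not ln.startswith(keys)]
    p.write_text("\n".join(SETTINGS + kept) + "\n")

# Usage (the path is an assumption, adjust to your install folder):
# prepend_bench_settings("prime.txt")
```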
Results are saved in results.txt. I just need the CPU model, the clock it was running at during the benchmark, and the "Throughput" results line, e.g.:
Timings for 128K FFT length (4 cpus, 4 workers): 0.57, 0.52, 0.47, 0.45 ms. Throughput: 8017.20 iter/sec.
There may be two lines if you have HT on; if so, please post both.
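For anyone collecting several runs, a rough sketch of pulling the throughput figure out of a results line follows. The pattern is based only on the one sample line in this post; the allowance for extra text after "cpus" (for the HT case) is my assumption, and `parse_throughput` is just an illustrative name.

```python
import re

# Pattern derived from the single sample line in this post, not from any
# official description of the results.txt format.
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cpus?[^,]*, (\d+) workers?\):"
    r".*?Throughput: ([\d.]+) iter/sec"
)

def parse_throughput(line):
    """Return the benchmark fields as a dict, or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "fft_k": int(m.group(1)),
        "cpus": int(m.group(2)),
        "workers": int(m.group(3)),
        "iter_per_sec": float(m.group(4)),
    }

sample = ("Timings for 128K FFT length (4 cpus, 4 workers): "
          "0.57, 0.52, 0.47, 0.45 ms. Throughput: 8017.20 iter/sec.")
print(parse_throughput(sample))
```

To pick the best of several runs, apply this to each line of results.txt and take the maximum `iter_per_sec`.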
Predicted results based on my Skylake system, assuming KL has same IPC:
MHz iter/sec
4000 8893
4200 9338
4400 9782
4600 10227
4800 10672
5000 11116
The above assumes a quad core; for a dual core, halve the expected values. If we get repeatable results that differ from these by more than about 1%, that would be interesting and worthy of further investigation.
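The predictions are plain linear scaling from one reference point, so any other clock can be checked the same way. A small sketch, assuming the 4000 MHz quad core row (8893 iter/sec) as the reference and linear scaling with both clock and core count; the function names are just for illustration.

```python
# Reference point taken from the predicted table: quad core at 4000 MHz.
REF_MHZ = 4000
REF_ITER_PER_SEC = 8893
REF_CORES = 4

def predicted(mhz, cores=4):
    """Expected iter/sec, assuming linear scaling with clock and core count."""
    return REF_ITER_PER_SEC * (mhz / REF_MHZ) * (cores / REF_CORES)

def deviation_percent(measured, mhz, cores=4):
    """Percent difference of a measured result from the prediction."""
    expected = predicted(mhz, cores)
    return 100.0 * (measured - expected) / expected

# Rebuild the predicted values for the clocks listed above.
for mhz in (4000, 4200, 4400, 4600, 4800, 5000):
    print(mhz, round(predicted(mhz)))

# Hypothetical reading, for illustration only (not a real result):
# print(deviation_percent(9400, 4200))
```

A deviation well beyond about 1% in either direction, repeated over several runs, is the kind of result worth following up.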