Question
I have some misconceptions about measuring FLOPS. On Intel architecture, is a FLOP one addition and one multiplication together? I read this somewhere online and found nothing that contradicts it. I know that FLOP has a different meaning on different types of CPU.
How do I calculate my theoretical peak FLOPS? I am using an Intel(R) Core(TM)2 Duo CPU E7400 @ 2.80GHz. What exactly is the relationship between GHz and FLOPS? (Even Wikipedia's entry on FLOPS does not specify how to do this.)
I will be using the following method to measure the actual performance of my computer (in terms of FLOPS): the inner product of two vectors. For two vectors of size N, is the number of flops 2N(N - 1), if one addition or one multiplication is considered to be 1 flop? If not, how should I go about calculating this?
I know there are better ways to do this, but I would like to know whether my proposed calculations are right. I read somewhere about LINPACK as a benchmark, but I would still like to know how it's done.
Answer 1:
As for your 2nd question, the theoretical FLOPS calculation isn't too hard. It can be broken down into roughly:
(Number of cores) * (Number of execution units / core) * (cycles / second) * (Execution unit operations / cycle) * (floats-per-register / Execution unit operation)
A Core 2 Duo has 2 cores and 1 execution unit per core. An SSE register is 128 bits wide and a float is 32 bits wide, so a register holds 4 floats. I assume the execution unit does 1 SSE operation per cycle. So it should be:
2 * 1 * 2.8 * 1 * 4 = 22.4 GFLOPS
which matches: http://www.intel.com/support/processors/sb/cs-023143.htm
This number is purely a theoretical best case. Real-world performance will most likely not come close to it, for a variety of reasons. It's probably not worth trying to correlate FLOPS directly with actual application runtime; you'd be better off timing the computations your application actually uses.
Answer 2:
This article shows some theory on FLOPS numbers for x86 CPUs. It's only current up to Pentium 4, but perhaps you can extrapolate.
Answer 3:
FLOP stands for floating-point operation.
It means the same in any architecture that supports floating-point operations, and is usually measured as the number of operations that can take place in one second (FLOPS: floating-point operations per second).
Here you can find tools to measure your computer's FLOPS.
Answer 4:
Intel's data sheets contain GFLOPS numbers, and your processor has a claimed 22.4:
http://www.intel.com/support/processors/sb/CS-023143.htm
Since your machine is dual core, that means 11.2 GFLOPS per core at 2.8 GHz. Divide 11.2 by 2.8 and you get 4, so Intel claims that each core can do 4 FLOPs per cycle.
Source: https://stackoverflow.com/questions/1536867/flops-intel-core-and-testing-it-with-c-innerproduct