SHORT SP/DP scalar/vector MFlops/s

EVENTSET
PMC3  PM_FLOP_CMPL
PMC4  PM_RUN_INST_CMPL
PMC5  PM_RUN_CYC

METRICS
Runtime (RDTSC) [s] time
CPI  PMC5/PMC4
SP/DP [MFLOP/s] (scalar assumed) 1.0E-06*PMC3*2.0/time
SP [MFLOP/s] (vector assumed) 1.0E-06*PMC3*8.0/time
DP [MFLOP/s] (vector assumed) 1.0E-06*PMC3*4.0/time

LONG
Formulas:
CPI = PM_RUN_CYC/PM_RUN_INST_CMPL
SP/DP [MFLOP/s] (scalar assumed) = 1.0E-06*PM_FLOP_CMPL*2.0/runtime
SP [MFLOP/s] (vector assumed) = 1.0E-06*PM_FLOP_CMPL*8.0/runtime
DP [MFLOP/s] (vector assumed) = 1.0E-06*PM_FLOP_CMPL*4.0/runtime
--
This group counts floating-point operations. All is derived out of a
single event PM_FLOP_CMPL, so if you have mixed usage of SP or DP and
scalar and vector operations, the count won't be exact. With pure codes
the counts are pretty accurate (e.g. when using likwid-bench).
