SHORT Cache bandwidth in MBytes/s

EVENTSET
FIXC0 INSTR_RETIRED_ANY
FIXC1 CPU_CLK_UNHALTED_CORE
FIXC2 CPU_CLK_UNHALTED_REF
PMC0  L1D_REPLACEMENT
PMC1  L1D_M_EVICT
PMC2  L2_LINES_IN_ALL
PMC3  L2_LINES_OUT_DIRTY_ALL
CBOX0C0 LLC_VICTIMS_M_STATE
CBOX1C0 LLC_VICTIMS_M_STATE
CBOX2C0 LLC_VICTIMS_M_STATE
CBOX3C0 LLC_VICTIMS_M_STATE
CBOX4C0 LLC_VICTIMS_M_STATE
CBOX5C0 LLC_VICTIMS_M_STATE
CBOX6C0 LLC_VICTIMS_M_STATE
CBOX7C0 LLC_VICTIMS_M_STATE
CBOX8C0 LLC_VICTIMS_M_STATE
CBOX9C0 LLC_VICTIMS_M_STATE
CBOX10C0 LLC_VICTIMS_M_STATE
CBOX11C0 LLC_VICTIMS_M_STATE
CBOX12C0 LLC_VICTIMS_M_STATE
CBOX13C0 LLC_VICTIMS_M_STATE
CBOX14C0 LLC_VICTIMS_M_STATE


METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s] FIXC1*inverseClock
Clock [MHz]  1.E-06*(FIXC1/FIXC2)/inverseClock
CPI  FIXC1/FIXC0
L1 to L2 Load [MBytes/s] 1.0E-06*PMC0*64.0/time
L1 to L2 Evict [MBytes/s] 1.0E-06*PMC1*64.0/time
L1 to L2 bandwidth [MBytes/s] 1.0E-06*(PMC0+PMC1)*64.0/time
L1 to L2 data volume [GBytes] 1.0E-09*(PMC0+PMC1)*64.0
L2 to L3 Load [MBytes/s] 1.0E-06*PMC2*64.0/time
L2 to L3 Evict [MBytes/s] 1.0E-06*PMC3*64.0/time
L2 to L3 bandwidth [MBytes/s] 1.0E-06*(PMC2+PMC3)*64.0/time
L2 to L3 data volume [GBytes] 1.0E-09*(PMC2+PMC3)*64.0
L3 to Memory data volume [MBytes] 1.0E-06*(CBOX0C0+CBOX1C0+CBOX2C0+CBOX3C0+CBOX4C0+CBOX5C0+CBOX6C0+CBOX7C0+CBOX8C0+CBOX9C0+CBOX10C0+CBOX11C0+CBOX12C0+CBOX13C0+CBOX14C0)*64

LONG
Formulas:
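L1 to L2 Load [MBytes/s] = 1.0E-06*L1D_REPLACEMENT*64/time
L1 to L2 Evict [MBytes/s] = 1.0E-06*L1D_M_EVICT*64/time
L1 to L2 bandwidth [MBytes/s] = 1.0E-06*(L1D_REPLACEMENT+L1D_M_EVICT)*64/time
L1 to L2 data volume [GBytes] = 1.0E-09*(L1D_REPLACEMENT+L1D_M_EVICT)*64
L2 to L3 Load [MBytes/s] = 1.0E-06*L2_LINES_IN_ALL*64/time
L2 to L3 Evict [MBytes/s] = 1.0E-06*L2_LINES_OUT_DIRTY_ALL*64/time
L2 to L3 bandwidth [MBytes/s] = 1.0E-06*(L2_LINES_IN_ALL+L2_LINES_OUT_DIRTY_ALL)*64/time
L2 to L3 data volume [GBytes] = 1.0E-09*(L2_LINES_IN_ALL+L2_LINES_OUT_DIRTY_ALL)*64
L3 to Memory data volume [MBytes] = 1.0E-06*SUM(LLC_VICTIMS_M_STATE)*64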
-
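Profiling group to measure the bandwidth between L1 and L2, the bandwidth
between L2 and L3, and the data volume written back from L3 to memory. The
L1 to L2 bandwidth is computed from the number of cache lines allocated in the
L1 and the number of modified cache lines evicted from the L1. The L2 to L3
bandwidth is computed in the same way from the cache lines allocated in the L2
and the dirty cache lines evicted from the L2. The L3 to memory data volume is
computed from the modified cache lines evicted from the L3, summed over all
CBOX units. The group also outputs the total data volume transferred between
L2 and L1 and between L3 and L2. Note that the L1 to L2 bandwidth also includes
data transfers due to a write allocate load on a store miss in L1.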
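
Worked example (a sketch, not part of the group definition): the Python
snippet below evaluates the L1 to L2 formulas listed above by hand. The counter
values and the runtime are made-up numbers for illustration only, and the
helper names are hypothetical; they are not part of LIKWID.

```python
# Assumption: 64-byte cache lines, as used by the formulas in this group.
CACHELINE_BYTES = 64


def bandwidth_mbytes_per_s(lines: int, runtime_s: float) -> float:
    """Bandwidth in MBytes/s for `lines` cache lines moved in `runtime_s` seconds."""
    return 1.0e-06 * lines * CACHELINE_BYTES / runtime_s


def data_volume_gbytes(lines: int) -> float:
    """Data volume in GBytes for `lines` cache lines."""
    return 1.0e-09 * lines * CACHELINE_BYTES


# Hypothetical raw counts and runtime, chosen only to illustrate the arithmetic.
l1d_replacement = 2_000_000  # L1D_REPLACEMENT: lines allocated in L1
l1d_m_evict = 500_000        # L1D_M_EVICT: modified lines evicted from L1
runtime = 0.5                # seconds

l1_l2_load = bandwidth_mbytes_per_s(l1d_replacement, runtime)
l1_l2_evict = bandwidth_mbytes_per_s(l1d_m_evict, runtime)
l1_l2_bandwidth = bandwidth_mbytes_per_s(l1d_replacement + l1d_m_evict, runtime)
l1_l2_volume = data_volume_gbytes(l1d_replacement + l1d_m_evict)

print(f"L1 to L2 Load [MBytes/s]      {l1_l2_load:.2f}")
print(f"L1 to L2 Evict [MBytes/s]     {l1_l2_evict:.2f}")
print(f"L1 to L2 bandwidth [MBytes/s] {l1_l2_bandwidth:.2f}")
print(f"L1 to L2 data volume [GBytes] {l1_l2_volume:.4f}")
```

The L2 to L3 metrics follow the same pattern with PMC2 and PMC3 in place of
PMC0 and PMC1.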

