Apple Microarchitecture Research by Dougall Johnson

BL

Test 1: uops

Code:

  bl .+4

(no loop instructions)

1000 unrolls and 1 iteration

Retires: 1.000

Issues: 0.000

Integer unit issues: 0.001

Load/store unit issues: 0.000

SIMD/FP unit issues: 0.000

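retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | map int uop (7c) | ? int output thing (e9)
1004 | 3076 | 1 | 1 | 1000 | 1
1004 | 2037 | 1 | 1 | 1000 | 1
1004 | 2202 | 1 | 1 | 1000 | 1
1004 | 2244 | 1 | 1 | 1000 | 1
1004 | 2171 | 1 | 1 | 1000 | 1
1004 | 2084 | 1 | 1 | 1000 | 1
1004 | 2129 | 1 | 1 | 1000 | 1
1004 | 2307 | 1 | 1 | 1000 | 1
1004 | 2170 | 1 | 1 | 1000 | 1
1004 | 1983 | 1 | 1 | 1000 | 1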

Test 2: throughput

Count: 8

Code:

  bl .+4
  bl .+4
  bl .+4
  bl .+4
  bl .+4
  bl .+4
  bl .+4
  bl .+4

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code divided by count): 1.0492

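retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | int uops in schedulers (59) | dispatch uop (78) | map int uop (7c) | map int uop inputs (7f) | ? int output thing (e9) | ? int retires (ef)
80204 | 85350 | 101 | 101 | 100 | 300 | 100 | 80206 | 200 | 1 | 100
80204 | 84117 | 101 | 101 | 100 | 300 | 100 | 80206 | 200 | 1 | 100
80204 | 83918 | 101 | 101 | 100 | 300 | 100 | 80217 | 200 | 1 | 100
80204 | 83935 | 101 | 101 | 100 | 300 | 100 | 80206 | 200 | 1 | 100
80204 | 83953 | 101 | 101 | 100 | 300 | 100 | 80201 | 200 | 1 | 100
80204 | 83912 | 101 | 101 | 100 | 300 | 100 | 80206 | 200 | 1 | 100
80204 | 83969 | 101 | 101 | 100 | 300 | 100 | 80206 | 200 | 1 | 100
80204 | 83939 | 101 | 101 | 100 | 300 | 100 | 80201 | 200 | 1 | 100
80204 | 83938 | 101 | 101 | 100 | 300 | 100 | 80201 | 200 | 1 | 100
80205 | 84142 | 101 | 101 | 100 | 300 | 100 | 80206 | 200 | 1 | 100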

1000 unrolls and 10 iterations

Result (median cycles for code divided by count): 2.9747

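retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map simd uop (7e) | map int uop inputs (7f) | ? int output thing (e9) | ? int retires (ef)
80024 | 241585 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80037 | 0 | 0 | 20 | 1 | 10
80024 | 238729 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80021 | 0 | 0 | 20 | 1 | 10
80024 | 237791 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80021 | 0 | 0 | 20 | 1 | 10
80024 | 238358 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80021 | 0 | 0 | 20 | 1 | 10
80025 | 238410 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80021 | 0 | 0 | 20 | 1 | 10
80024 | 238253 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80021 | 0 | 0 | 20 | 1 | 10
80024 | 238248 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80021 | 0 | 0 | 20 | 1 | 10
80024 | 238409 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80021 | 0 | 0 | 20 | 1 | 10
80024 | 238412 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80021 | 0 | 0 | 20 | 1 | 10
80024 | 239843 | 11 | 11 | 0 | 10 | 0 | 30 | 0 | 0 | 10 | 80021 | 0 | 0 | 20 | 1 | 10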
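
Sketch: one way the "median cycles for code divided by count" figures could be computed from the cycle (02) samples of the two throughput runs. This is an assumption, not the page's actual tooling: the helper name cycles_per_instruction is hypothetical, the divisor is assumed to be count x unrolls x iterations, and the cycle values are read from the rows above under the assumption that the second field of each row is the cycle count. Under those assumptions the results come out near the reported 1.0492 and 2.9747 (roughly 1.05 and 2.98), but not identical, so the real calculation presumably differs in some detail such as sample selection or overhead handling.

```python
from statistics import median


def cycles_per_instruction(cycle_samples, count, unrolls, iterations):
    """Median cycle count divided by the number of measured instructions executed.

    Assumption: the divisor is count * unrolls * iterations; the page only
    says "divided by count".
    """
    executed = count * unrolls * iterations
    return median(cycle_samples) / executed


# Cycle (02) samples from the two throughput runs (assumed field split).
cycles_100x100 = [85350, 84117, 83918, 83935, 83953,
                  83912, 83969, 83939, 83938, 84142]
cycles_1000x10 = [241585, 238729, 237791, 238358, 238410,
                  238253, 238248, 238409, 238412, 239843]

short_run = cycles_per_instruction(cycles_100x100, count=8, unrolls=100, iterations=100)
long_run = cycles_per_instruction(cycles_1000x10, count=8, unrolls=1000, iterations=10)

print(f"100 unrolls x 100 iterations: {short_run:.4f} cycles per BL")
print(f"1000 unrolls x 10 iterations: {long_run:.4f} cycles per BL")
```

Both runs execute the same 80000 BL instructions in total, so the per-instruction figures are directly comparable between the two unroll settings.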