Apple Microarchitecture Research by Dougall Johnson

B.cc (taken)

Test 1: uops

Code:

  b.ne .+4

(no loop instructions)

1000 unrolls and 1 iteration

Retires: 1.000

Issues: 1.000

Integer unit issues: 1.000

Load/store unit issues: 0.000

SIMD/FP unit issues: 0.000

retire uop (01) | cycle (02) | 03 | 1e | 3f | 51 | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | int uops in schedulers (59) | 60 | 6d | 6e | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map int uop inputs (7f) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | 9f | l1d cache writeback (a8) | ac | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | f5 | f6 | f7 | f8 | fd
1004308219035251000100010005000119601972318100010001000197620461110011000100010001000001846559958479482186720331947198320172005
10042044150352510001000100050001187620003181000100010002018202011100110001000100010000019224631020445484206720371995211119851973
1004209215035251000100010005000120061820318100010001000194420721110011000100010001000001962534960483418197320671975197920671959
1004205814035251000100010005000119542072318100010001000201618941110011000100010001000001962517930494471195719831853196119111971
1004205214035251000100010005000119821926318100010001000196019121110011000100010001000001924455966466439193918231987197720651979
1004200414035251000100010005000118702164318100010001000201219861110011000100010001000001980520938425466196119851881202719752063
1004197415035251000100010005000119861966318100010001000183219901110011000100010001000001916467946510457199119232011198120392013
1004198415035251000100010005000120701980318100010001000199620461110011000100010001000001958479920463482192319811911203920071911
1004202814035251000100010005000120801970316100010001000200820181110011000100010001000001960490964472449198519992005203719051977
1004204614098251000100010005000119382046318100010001000208619961110011000100010001000001938414974465464193719612071197719711877

Test 2: throughput

Count: 8

Code:

  b.ne .+4
  b.ne .+4
  b.ne .+4
  b.ne .+4
  b.ne .+4
  b.ne .+4
  b.ne .+4
  b.ne .+4

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code divided by count): 1.0096

retire uop (01) | cycle (02) | 03 | 09 | l2 tlb miss data (0b) | 18 | 19 | 1e | 1f | 3f | 51 | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | int uops in schedulers (59) | 60 | 69 | 6a | 6b | 6d | 6e | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map int uop inputs (7f) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int alu (97) | 9f | l1d cache writeback (a8) | a9 | ac | branch cond mispred nonspec (c5) | branch mispred nonspec (cb) | cd | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
8020481169605000000218278010580105801074006291497770608077280792610801078020780207807726473011802018010080099801001008010010311180747422671314319807671008077380773807758078180775
8020480764604000012070278010580105801074006290497770008079280772610801078020780207807986473011802018010080099801001008010000011180753339705316321807731008077780775807778077780775
802048077260500000028278014380124801074006291497770608076880770610801078020780207807706474241802018010080099801001008010000011180747314647316313807651008076180767807718076780767
802048077460500000028278010580105801074006290497772408078280790610801078020780207807666472011802018010080099801001008010030011180753312643313318807691008076380765807778076780767
8020480934606000012028278010580105801074006291497772008080280798610801078020780207807646472211802018010080099801001008010000011180743316657318325807771008076581229808138077380767
802048076060500000028278010580105801074006291497769808079680784610801078020780207807806473411802018010080099801001008010000011180749320651315319807671008077580773807738077180777
802048077660500000028278010580105801074006290497769808077480778610801078020780207807706472811802018010080099801001008010000011180755316651309317807931008111880783807698077380773
802048076860500000028278010580105801074006290497769408076480758610801078020780207807786475211802018010080099801001008010000011180753319655315315807711008076980771807698077180775
802048078460500000028278010580105801074006290497770208078080776610801078020780207807706474611802018010080099801001008010000011180767565701321309807731008078380769807818077980773
802048076060511000028278010580105801074006291497769208077080768610801078020780207807826473611802018010080099801001008010000014180771320661319323807751008077980781807798078180799

1000 unrolls and 10 iterations

Result (median cycles for code divided by count): 3.0006
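The figure above can be roughly reproduced from the raw counters. The following is a minimal Python sketch, not the author's tooling. It assumes the result is the median of the cycle (02) column divided by the total number of measured instructions (count × unrolls × iterations), and it ignores loop overhead. The function name and the way the cycle values were read out of the run-together rows below are assumptions.

```python
from statistics import median

def cycles_per_instruction(cycle_samples, count, unrolls, iterations):
    """Median cycle count divided by the number of measured instructions.

    Hypothetical helper: loop overhead is not subtracted here.
    """
    total_instructions = count * unrolls * iterations
    return median(cycle_samples) / total_instructions

# Cycle (02) values as read from the ten rows of the 1000-unroll run.
# Assumption: the first five digits of each row are the retire count,
# and the next six digits are the cycle count.
samples = [240093, 240044, 240044, 240044, 240044,
           240044, 240044, 240044, 240044, 240050]

result = cycles_per_instruction(samples, count=8, unrolls=1000, iterations=10)
print(f"{result:.5f} cycles per taken branch")
```

Under these assumptions the value comes out at roughly 3.0006 cycles per taken branch, in line with the reported result.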

retire uop (01) | cycle (02) | 03 | mmu table walk instruction (07) | l2 tlb miss instruction (0a) | 1e | 1f | 3a | 3f | 51 | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | int uops in schedulers (59) | 60 | 61 | 69 | 6a | 6b | 6d | 6e | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map int uop inputs (7f) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int alu (97) | 9f | l1d cache writeback (a8) | ac | branch cond mispred nonspec (c5) | branch mispred nonspec (cb) | cd | cf | d0 | d2 | l1i cache miss demand (d3) | d5 | map dispatch bubble (d6) | da | db | dd | fetch restart (de) | e0 | ea | ec | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
80024240093179810800913428800118001180012400058004923648602400442400446108001280022800222400482400401180021800108000980010108001000111240021510798421600162079971800042400410010240045240048240047240047240043
8002424004417987700913328800118001180012400058194923696202400442400446108001280022800222400442400421180021800108000980010108001000111240019820800001600160079965800022400410010240045240045240045240045240045
8002424004417988800982128800118001180012400058194923696232400442400446108001280022800222400442400441180021800108000980010108001000111240021820800011600140079968800012400410010240045239807240043240045240045
8002424004417988800979828800118001180012400058194923696402400422396556108001280022800222400442400441180021800108000980010108001000111240021820800001600160079969800022400410010240045240043240045240045240045
80024240044179888009781288001180011800124000581949236964024004424004461080012800228002224004424004411800218001080009800101080010001112400218208000116001600799678000224004116010240045240045240043239901240045
8002424004417988700979928800118001180012400058194923696402400442400446108001280022800222400442400441180021800108000980010108001000111240021820800011600160079968800022400410010240009240045240045240045240045
8002424004417988800979328800118001180012400058194923696402400442400446108001280022800222400422400441180021800108000980010108001003111240021820800011600160079969800022400410010240045240045240043239964240045
8002424004417988800979328800118001180012400058194923696402400442400426108001280022800222400442400441180021800108000980010108001000111240021820800011600160079969800022400410010240043240045240045240045240045
80024240044179810800913428800118001180012400058194923696402400442400446108001280022800222400442400441180021800108000980010108001000141240021820799991600120079969800022402620010240049240043240045240037240047
80024240050179886001114941800118001180012400058194923696402400442400446108001280022800542400442400441180021800108000980010108001010111240021820800011600160079970800022400410010240043240045240045239876240043