Apple Microarchitecture Research by Dougall Johnson

M1/A14 P-core (Firestorm): Overview | Base Instructions | SIMD and FP Instructions
M1/A14 E-core (Icestorm):  Overview | Base Instructions | SIMD and FP Instructions

STRB (pre-index)

Test 1: uops

Code:

  strb w0, [x6, #8]!

(no loop instructions)

1000 unrolls and 1 iteration

Retires: 1.000

Issues: 2.000

Integer unit issues: 1.000

Load/store unit issues: 1.000

SIMD/FP unit issues: 0.000
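
The summary above reads as raw counter totals divided by the number of unrolled copies of the instruction. A minimal Python sketch of that normalisation follows. The function name `per_instruction` and the input totals are illustrative assumptions chosen to match the summary, not values parsed from the counter table, and any subtraction of fixed harness overhead is not modelled.

```python
def per_instruction(total: float, unrolls: int, iterations: int = 1) -> float:
    """Divide a raw counter total by the number of executed test instructions."""
    return total / (unrolls * iterations)


# Hypothetical counter totals for 1000 unrolls and 1 iteration (assumed values).
totals = {
    "issues": 2000,
    "integer unit issues": 1000,
    "load/store unit issues": 1000,
    "simd/fp unit issues": 0,
}

summary = {name: per_instruction(t, unrolls=1000) for name, t in totals.items()}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Read this way, each pre-index STRB issues two uops: one to a load/store unit and one to an integer unit, the latter presumably for the base-register writeback.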

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | 09 | l2 tlb miss data (0b) | 1e | 1f | 20 | 22 | 3a | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 60 | 61 | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst int store (96) | inst ldst (9b) | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | cf | d0 | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
10051040700001241517010258005252000100010001000100050778458241010401040824389820001000200010401241110011000100010200453131001102012610127397073031633103710001000100010411041104110411041
10041040700001241419410259222252000100010001000100050778458241010401040824389820001000200010401241110011000100010380633171000201812010140470073031633103710001000100010411041104110411041
10041040800006414152010250022252000100010001000100050778458241010401040824389820001000200010401241110011000100010180553161000101412310160630073031633103710001000100010411041104110411041
10041040800002161616010250023252000100010001000100050778458241010901040824389820001000200010401241110011000100010320644151001101412310001310073031633103710001000100010411041104110411041
100410407000004201701025925225200010001000100010005077845824101040104082438982000100020001040124111001100010001033087720100010166010150390073031633103710001000100010411041104110411041
1004104070000041414010251005225200010001000100010005077845824101040104082438982000100020001040124111001100010001036055216100110166310000550073031633103710001000100010411041104110411041
100410408000066141501025902225200010001000100010005077845824101040104082438982000100020001040124111001100010001018060216100010146310180390073031633103710001000100010411041104110411041
1004104080000041415010251152225200010001000100010005077845824101040104082438982000100020001040124111001100010001039878514100710120710007397173031633103710001000100010411041104110411041
100410408111101019140102512226252000100010001000100050762458241010401040824389820001000200010401241110011000100010179470131007101361010137397273031633103710001000100010411041104110411041
1004104081110010131601025104152520001000100010001000507624582410104010408243898200010002000104012411100110001000104010635110070219121010167477073031633103710001000100010411041104110411041

Test 2: Latency 2->2

Code:

  strb w0, [x6, #8]!

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code): 1.0040
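
The result line is a per-instruction figure. The sketch below shows one way such a number could be derived, assuming it is the median total cycle count over repeated runs divided by unrolls times iterations. The sample cycle counts are hypothetical, and the use of `statistics.median_low` is an assumption rather than the author's confirmed method.

```python
import statistics


def cycles_per_instruction(cycle_samples, unrolls, iterations):
    """Median total cycles across runs, normalised per executed instruction."""
    median_cycles = statistics.median_low(cycle_samples)
    return median_cycles / (unrolls * iterations)


# Hypothetical per-run cycle totals for 100 unrolls and 100 iterations.
samples = [10040, 10040, 10146, 10040, 10040]

latency = cycles_per_instruction(samples, unrolls=100, iterations=100)
print(f"{latency:.4f}")
```

If "2->2" denotes the chain through the second operand (x6), every STRB depends on the base register written back by the previous one, so a figure of about one cycle per instruction would correspond to a one-cycle latency for the address update.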

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | l2 tlb miss data (0b) | 18 | 19 | 1e | 1f | 20 | 22 | 23 | 24 | 29 | 3a | 3c | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | branch cond mispred nonspec (c5) | branch mispred nonspec (cb) | cd | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ea | eb | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
102091004075200002010119841100776731921002581811110657252010010100100001010010000522067468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010920281387351066910275254092640870109483511032100007101171110037100002010000101001004110041100411004110041
1020410040753000022111208351007768501081002580511510657252010010100100001010010000522047468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010906281245369068610287261488836910109123811522800007101171110037100003010000101001004110041100411004110041
102041004075440002061110806100768821156100257811139952252010010100100001010010000521987468824496960100401004086743874720100200100002002000010040123111020110099100100001001000010010930281285376065910281282089432882109133411922100007101171110037100000010000101001004110041100411004110041
10204101467540000207682813100408601241002581496124612520100101001000010100100005220754688244969601004010040867468747201002001000020020000100401221110201100991001000010010000100109302913973870673102872514932429001088340125221120007101171110037100003010000101001004110041100411004110041
10204100407540020224110480810078475016010025798848844252010010100100001010010000522013468824496960100401004086743874720100200100002002000010040123111020110099100100001001000010010918281248376069810275277091036940109134111222800007101171110037100007010000101001004110041100411004110041
102041004075400001947116790100744832112100258061069154252010010100100001010010000522035468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010927281271375067810279301091032854109353510972800007101171110037100001010000101001004110041100411004110041
10204100407540400239711184510076084010010025773939553252010010100100001010010000522115468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010921241341359066210287309090232983109403711681430007101171110037100001010000101001004110041100411004110041
10204100407533000217510578610077681211210025756918164252010010100100001010010000522005468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010899211411339065810266250090832907109032910991460007101171110037100002010000101001004110041100411004110041
10204100407530300211511079810074485310010025820839055252010010100100001010010000522083468824496960100401004086743874720100200100002002000010040123111020110099100100001001000010010964211339382064410287266390434896109442411582100007101171110037100007010000101001004110041100411004110041
102041004075333002094116816100792900112100257848810460252010010100100001010010000522003468824496960100401004086743874720100200100002002000010040123111020110099100100001001000010010927211263381068810250292092032983109042811272100007101171110037100001010000101001004110041100411004110041

1000 unrolls and 10 iterations

Result (median cycles for code): 1.0040

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | 09 | l2 tlb miss data (0b) | 1e | 1f | 20 | 22 | 29 | 3a | 3c | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 60 | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ea | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
1002910040752220212187811175292010010025805681125125200101001010000100101000052104946882404969601004010040869638770200102010000202000010040124111002110910100001010000101094421133039407241027126609283682510908291141140640316331003710000010000100101004110041100411004110041
1002410040761000205894828172085010810025823978548252001010010100001001010000521049468824149696010040100408696387702001020100002020000100401241110021109101000010100001010907713033710715102662412958467541093118126772640216331003710000110000100101004110041100411004110041
100241004075100118961168501728901116100257888996572520010100101000010010100005210594688241496960100401004086963877020010201000020200001004012411100211091010000101000010109231411613730661102612880924468131094529122670640316331003710000010000100101004110041100411004110041
10024100407611102094888111768830120100258291119763252001010010100001001010000521049468824149696010040100408696387702001020100002020000100401241110021109101000010100001010889712713800672102772820905428521089318117970640216331003710000210000100101004110041100411004110041
10024100407510002001868321760842136100258171028941252001010010100001001010000521073468824049696010040100408696387702001020100002020000100401241110021109101000010100001010950714043350692102642672909368381096321123970640316231003710000110000100101004110041100411004110041
1002410040751011224789805178490311210025819828043252001010010100001001010000521017468824049696010040100408696387702001020100002020000100401241110021109101000010100001010904913253720683102682810927508721092022131471640316321003710000110000100101004110041100411004110041
1002410040751110220590868178489011610025827949247252001010010100001001010000521041468824049696010040100408696387702001020100002020000100401241110021109101000010100001010965713153470671102492740945367651094117127470640316231003710000110000100101004110041100411004110041
1002410040751000214883838171284016410025815716034252001010010100001001010000521041468824049696010040100408696387702001020100002020000100401241110021109101000010100001010921712983520707102702711935469191095820106772640316231003710000210000100101004110041100411004110041
1002410040751000212189825168071011610025832996549252001010010100001001010000521049468824049696010040100408696387702001020100002020000100401241110021109101000010100001010923713433530699102602630888369131093118126370640216331003710000010000100101004110041100411004110041
1002410040751000209491814174490012410025796726248252001010010100001001010000521041468824149696010040100408696387702001020100002020000100401241110021109101000010100001010932713373610671102482871926368231093717110570640316321003710000310000100101004110041100411004110041

Test 3: throughput

Count: 8

Code:

  strb w0, [x6, #8]!
  strb w0, [x7, #8]!
  strb w0, [x8, #8]!
  strb w0, [x9, #8]!
  strb w0, [x10, #8]!
  strb w0, [x11, #8]!
  strb w0, [x12, #8]!
  strb w0, [x13, #8]!
  mov x7, x6
  mov x8, x6
  mov x9, x6
  mov x10, x6
  mov x11, x6
  mov x12, x6
  mov x13, x6

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code divided by count): 0.5098
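
A reciprocal throughput below one cycle is often easier to read as instructions per cycle. The sketch below does that conversion. It assumes the result above is total cycles divided by unrolls times iterations times count, and the helper name `instructions_per_cycle` is hypothetical.

```python
def instructions_per_cycle(cycles_per_instruction: float) -> float:
    """Invert a cycles-per-instruction figure into a per-cycle rate."""
    if cycles_per_instruction <= 0:
        raise ValueError("cycles per instruction must be positive")
    return 1.0 / cycles_per_instruction


# 0.5098 is the reported median cycles per STRB for this test.
ipc = instructions_per_cycle(0.5098)
print(f"{ipc:.2f} stores per cycle")
```

Under that assumption, the measurement corresponds to just under two pre-index byte stores per cycle.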

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | 09 | 18 | 19 | 1e | 1f | 20 | 22 | 29 | 3a | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 60 | 67 | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | c2 | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
80209407073050000182496379516561231124073776015451770137251609948039780004801008000041285318774400183249377444078540772307283306921601002008000020016000040736751180201100991008000010080000100808724193474690680570238086413015708134158339820511011611407668047780000801004083940739408194081240704
80204408243050000179774778516881059240782768185816032292516080780496800008010080000419238187297612604937641408044081030586330698160100200800002001600004082475118020110099100800001008000010080886376145628558055226908842216008134551441760511011711407028063480000801004083740861406964087040797
8020440861306000016086957921656981044075678217261711109251603878086980015801008000040321118761920170249376514080640793306753307081601002008000020016000040803761180201100991008000010080000100808574113491178508056523208756614838143549140270511011711407058089880000801004069840729407874076740683
80204407263050000179188076316729614440805771150916331532516708181429800008010080000418442187837602674937686408434064330733330718160100200800002001600004075375118020110099100800001008000010080878390846168488052124508783215268142146041920511011611408558434980000801004079140835407374079240749
8020440812305000016688427761688114136407217671561166517425160618858518001480100800004089371867300021724937754407644085230706330787160100200800002001600004081675118020110099100800001008000010080892418143898558054224608683413028140452946420511011711407098070380000801004077840966407304081440688
802044092630600001554815779172811213640738768179119171812516410180630800008010080000402494187516002964937559407494078230589330730160100200800002001600004083775118020110099100800001008000010080871459046168908045321908919815658141155645660511011711407948055180000801004079740800407334077840721
802044078630500001431700775168812610440680755179117991822516073683319800008010080000401626187715213904937737407944075530557330763160100200800002001600004070875118020110099100800001008000010080871381750138658049626208592614218135651741990511011611407318111980000801004087840880408014072640796
8020440795306000020346687901688108136408627521695151717125160327869798003080100800004024601874776053149375684080340704307383307731601002008000020016000040796751180201100991008000010080000100808554347452118448051727908526415458142445342240511011711408338105280000801004078240825406274071440816
8020440781306000016298017811696130964076976516271918110251682798318280000801008000041364918752801306493771440766407503067733075216010020080000200160000406667611802011009910080000100800001008088342824611283980510259085912615068131652440140511011611406858272880000801004070641321407424073540769
8020440782305000018907287851680119156408487651732163618825161141826618000080100800004048611877344018934937663407494071530543330750160100200800002001600004081575118020110099100800001008000010080833400148588738052426008413015248136946938630511011711408358038980000801004074940738407184086440836

1000 unrolls and 10 iterations

Result (median cycles for code divided by count): 0.5093

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | l2 tlb miss data (0b) | 1e | 1f | 20 | 22 | 29 | 3a | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 60 | 67 | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | cf | d5 | map dispatch bubble (d6) | db | dd | fetch restart (de) | e0 | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
800294044730310120943848251720118116403518301973185710025160309803208000080010800004051621859780019949374374037140522304683303871600102080000201600004052992118002110910800001080000108095314409150369208028228409162812138119824746661415020116011406038042780000800104046840375404014036640562
800244051330210019293218421736109140404858201996173074251642948058480100800108000040080518568040221493729740485403813036833049416001020800002016000040401921180021109108000010800001080938154614490129278024628829274212178119224346261415020118011403758031180000800104041440442403894055240404
80024404413021001914388813172010610040402824217420889025160273840588000080010800004103891860016012549373864047540388303143304501600102080000201600004040885118002110910800001080000108093616442448958858028428409373412748113525736811405020117011405748543680000800104046940424404634032040367
800244046130211119383898051792120964050379420511946108251668688031680053800108000040122818564690226493736040427404833028233046716001020800002016000040470931180021109108000010800001080931144711494109258028830909096613108116027049961405020118011404668119180000800104043940481404624048040462
8002440425302100195939181417601149640381797198918978125164849803688000080134800004104011857524014549373094042840463303143304031600102080000201600004041992118002110910800001080000108090914412353079008029128509102812228117928746051415020116011405118020180000800104042340461408394041940551
8002440448302110207937782117281151244054578418822127992516032480198800008001080000401691186061209349373184039140423304803304781600102080000201600004041091118002110910800001080000108091714407152059028029029409493611668117125544431405020118011403958449480000800104032940412404094044740356
8002440432302111200432782117441021444059978519942137119251603868043180043800108000040225718596240221493742140410403943037133046716001020800002016000040550901180021109108000010800001080991144498489129398025928009123812608118522944971415020117011404678040480000800104041540431405014045340466
800244043930210019864408321712121116403798222179204610325160920806228005380010800004007431856324030049374424040240397303653303811600102080000201600004044392118002110910800001080000108093515470248779198022429009334612158120128442911405020117011403698024980000800104038840476404114053740433
800244045830311019863608371776122112404447852183202488251696148420280000800108000040121618583960125949373584044640431303643304551600102080000201600004041387118002110910800001080000108091515413950768778027132309192812138114625844161405020117011406908491280000800104042540485403804039940433
800244037630310120823388201744116136404208231951194910725160508828358000080134800004065371858028048149373464040040415302993303641600102080000201600004042992118002110910800001080000108094314465348598918030030339213412348113025844711405020116011403978631480000800104046740865405014042940435