Apple Microarchitecture Research by Dougall Johnson

STR (pre-index, 64-bit)

Test 1: uops

Code:

  str x0, [x6, #8]!

(no loop instructions)

1000 unrolls and 1 iteration

Retires: 1.000

Issues: 2.000

Integer unit issues: 1.000

Load/store unit issues: 1.000

SIMD/FP unit issues: 0.000

retire uop (01) | cycle (02) | 03 | mmu table walk data (08) | l2 tlb miss data (0b) | 1e | 1f | 20 | 22 | 3a | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst int store (96) | inst ldst (9b) | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
10051040800641407010257702252000100010001000100050778458241040104082438982000100020001040124111001100010001016040227100011410010120317341633103710001000100010411041104110411041
1004104070004241501025900225200010001000100010005077845824104010408243898200010002000104012411100110001000101203121710000146010140477331633103710001000100010411041104110411041
10041040700641817010252760325200010001000100010005077845824104010408243898200010002000104012411100110001000102405532510020226410200317331633103710001000100010411041104110411041
10041040800124160001025200122520001000100010001000507784582410401040824389820001000200010401241110011000100010166472191001006010000717331633103710001000100010411041104110411041
1004104081004141501025951325200010001000100010005077845824104010408243898200010002000104012411100110001000101404721410001188310000317331633103710001000100010411041104110411041
100410407000500624102511003252000100010001000100050778458241040104082438982000100020001040124111001100010001000063315101201814310140517331633103710001000100010411041104110411041
1004104080103200501025956325200010001000100010005077845824104010408243898200010002000104012411100110001000100004729100013012010140317331633103710001000100010411041104110411041
1004104070004161520102590832520001000100010001000507784582410401040824389820001000200010401241110011000100010000392151003000010000397331633103710001000100010411041104110411041
10041040800041414010258002252000100010001000100050778458241040104082438982000100020001040124111001100010001012031317100001216010160317331633103710001000100010411041104110411041
100410408002440070102511072252000100010001000100050778458241040104082438982000100020001040124111001100010001014023212100001412010140557331633103710001000100010411041104110411041
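
The summary figures for Test 1 can be cross-checked with a little arithmetic. The sketch below assumes that every issued uop is counted under exactly one of the three unit classes, so the per-unit figures should sum to the total issue count. The variable names are illustrative and are not part of the original test harness.

```python
# Per-instruction figures reported in Test 1 for `str x0, [x6, #8]!`.
retires = 1.000
issues = 2.000
unit_issues = {"integer": 1.000, "load/store": 1.000, "simd/fp": 0.000}

# Assumption: each issued uop is counted under exactly one unit class,
# so the per-unit figures should add up to the total issue count.
total_unit_issues = sum(unit_issues.values())
issues_per_retire = issues / retires

print(f"sum of unit issues: {total_unit_issues:.3f} (reported issues: {issues:.3f})")
print(f"issues per retire: {issues_per_retire:.3f}")
```

One plausible reading is that the store itself goes to the load/store unit while the base-register writeback is handled as an integer uop. The source does not state this explicitly.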

Test 2: Latency 2->2

Code:

  str x0, [x6, #8]!

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code): 1.0040

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | l2 tlb miss data (0b) | 1e | 1f | 20 | 22 | 29 | 3a | 3c | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | branch cond mispred nonspec (c5) | branch mispred nonspec (cb) | cd | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ea | eb | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
1020910040754402139105809177673012810025803898854252010010100100001010610000521991468824496960100401004086816874320106200100082002001610040122111020110099100100001001000010010952281649374068610279256089838898109144012292801117170160010037100001010000101001004110041100411004110041
1020410040754002202124788172093015610025778917753252010010100100001010610000522055468824496960100401004086817874220106200100082002001610040122111020110099100100001001000010010926241159385268310279286089842903109613411732101117180160010037100004010000101001004110041100411004110041
10204100407530021009880318007909610025792838946252010010100100001010610000522093468824496960100401004086816874220106200100082002001610040122111020110099100100001001000010010954211373376066110264259092448948108772910762101117170160010037100007010000101001004110041100411004110041
102041004075300225097847176869014410025800809151252010010100100001010010000522069468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010917241192388069210249281492844831109344111792800007101171110037100002010000101001004110041100411004110041
1020410040754402322106822172077310410025800758464252010010100100001010010000521977468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010954281385393071910274278093742868109193911122800007101171110037100002010000101001004110041100411004110041
1020410040754002493117827172089016810025770868446252010010100100001010010000521971468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010933281351390067010260297889734873108874611082840007101171110037100004110000101001004110041100411004110041
102041004075300219393800176077015210025804839040252010010100100001010010000522085468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010890241231428067410270276389742910108803812502100007101171110037100000010000101001004110041100411004110041
1020410040753002169109835172886014810025759519757252010010100100001010010000522033468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010924241260384065610268289391832890109222611171420007101171110037100003010000101001004110041100411004110041
1020410040752002145107835178482014810025773707753252010010100100001010010000522013468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010878141344381066210236272088238935109522611211400007101171110037100001010000101001004110041100411004110041
102041004075202216990831168879035610025787947850252010010100100001010010000522043468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010904161366389068010275270088840822109032911381400007101171110037100002010000101001004110041100411004110041

1000 unrolls and 10 iterations

Result (median cycles for code): 1.0040

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | l2 tlb miss data (0b) | 1e | 1f | 20 | 22 | 29 | 3a | 3c | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ea | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
1002910040751112274918141744801188100257911228070252001010010100001001010000520809468824496960100401004086963877020010201000020200001004012411100211091010000101000010109509124338926581024727208924082010927211143140640316331003710000010000100101004110041100411004110041
10024100407510022298777017846901521002579494876025200101001010000100101000052107346882449696010040100408696387702001020100002020000100401241110021109101000010100001010877714223580683102592892906368921092620111371640316331003710000010000100101004110041100411004110041
100241004075110227787807172872910810025780906360252001010010100001001010000521049468824496960100401004086963877020010201000020200001004012411100211091010000101000010109141412473801269310255256290217289510940291257140640316221003710000210000100101004110041100411004110041
100241004075220228010883016887710112100258175078792520010100101000010010100005209134688244969601004010040869638770200102010000202000010040124111002110910100001010000101086814130036406971026930308645093710902291206140640316331003710000010000100101004110041100411004110041
100241004075200230410779516808010152100257915072612520064100391000010010100005210494688244969601004010040869638770200102010000202000010040124111002110910100001010000101090614127338226631026730109164697910896251092142640216331003710000010000100101004110041100411004110041
100241004075202226210682117447510152100258608262882520010100101000010010100005210174688244969601004010040869638770200102010000202000010040124111002110910100001010000101090814132633206801025428649364295610960241251142640316331003710000010000100101004110041100411004110041
10024100407622021391087661664810112100258157060742520010100101000010010100005208574688244969601004010040869638770200102010000202000010040124111002110910100001010000101092414125336366431026231228963683810918281162140640316331003710000210000100101004110041100411004110041
10024100407522022809576517208010132100257776480682520010100101000010010100005208734688244969601004010040869638770200102010000202000010040124111002110910100001010000101089414131138906781028527709184692510893291226140640316321003710000010000100101004110041100411004110041
10024100407520022231027861704670152100257894152622520010100101000010010100005210174688244969601004010040869638770200102010000202000010040124111002110910100001010000101090218132136206561027328528828289610917311163140640316331003710000010000100101004110041100411004110041
1002410040752022085998011736710120100258015564632520010100101000010010100005209554688244969601004010040869638770200102010000202000010040124111002110910100001010000101090216130241406481025526309064683810915291206140640216331003710000010000100101004110041100411004110041
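
Both Test 2 runs report the same per-instruction figure. As a rough sketch, assuming the result is the total cycle count divided by the number of executed stores (unrolls times iterations), the implied total can be recovered as below. The function name and the normalisation are assumptions made for illustration, not taken from the original harness.

```python
def implied_total_cycles(unrolls, iterations, cycles_per_instruction):
    """Total cycles implied by a per-instruction result (assumed normalisation)."""
    return unrolls * iterations * cycles_per_instruction

# (unrolls, iterations, reported median cycles per instruction) from Test 2.
runs = [(100, 100, 1.0040), (1000, 10, 1.0040)]

for unrolls, iterations, result in runs:
    total = implied_total_cycles(unrolls, iterations, result)
    stores = unrolls * iterations
    print(f"{unrolls} x {iterations}: about {round(total)} cycles for {stores} stores")
```

Because each pre-indexed store writes back x6 and the next store uses x6 as its base, the stores form a serial chain, so roughly one cycle per store suggests a base-register update latency of about one cycle. Reading "2->2" as "base register in, base register out" is an interpretation rather than something the source spells out.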

Test 3: throughput

Count: 8

Code:

  str x0, [x6, #8]!
  str x0, [x7, #8]!
  str x0, [x8, #8]!
  str x0, [x9, #8]!
  str x0, [x10, #8]!
  str x0, [x11, #8]!
  str x0, [x12, #8]!
  str x0, [x13, #8]!
  mov x7, x6
  mov x8, x6
  mov x9, x6
  mov x10, x6
  mov x11, x6
  mov x12, x6
  mov x13, x6

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code divided by count): 0.5099

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | l2 tlb miss data (0b) | 18 | 19 | 1e | 1f | 20 | 22 | 29 | 3a | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 60 | 67 | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
8020940514306440001959782772168812514040688780165115011822516395380432800008010080000401945187358014474937755408294077730706263084916010020080000200160000407709311802011009910080000100800001008087125413550518883805442910937401747813925593766356511011711408168071280000801004097440852407834080540820
80204408133063030019838448111664112132408307651828169018125162521847568000080100800004020491875044126349377474078740824307263307151601002008000020016000040760861180201100991008000010080000100809613438145037850805162883879841641812855243526420511011711408138055680000801004082340725407844086440817
8020440775306330001896739816168811080406587731820199011225160588830468000080100800004021911877732130649376634088240899306353308281601002008000020016024040861861180201100991008000010080000100809324440264489848806072623843401592813885084072426511011611406648290680000801004074940879409254073140717
80204407943073300018068397721664104924083373117431583142251607108331080000801008000040251018748781261949377604082440739306783307381601002008000020016000040774851180201100991008000010080000100809054438454617842805752716878401717813865353788210511011711408548075180000801004086940931407374075940923
802044077830530000183078181616961071244082073916601845160251604008187180018801008000040231118776601125449377834076540796306623308711601002008000020016000040714861180201100991008000010080000100808862641274909860805543000857441471814375253671356511011611408818390380000801004078240744408624073340844
802044084530530000201694678516641231324076176517231752136251604278080880006801008000040248618768451855493773240784408903077233082316010020080000200160000407808611802011009910080000100800001008093926444847223872805672720881421640814095503554420511011711407968218580000801004076840855407704081940790
802044081030530000181282577516561151324079278917191764112251605948086580000801008000040740518787641181849377714084040805306953308471601002008000020016000040771921180201100991008000010080000100809073546454795856804842910847381608813465364002350511011611407978050080000801004088140886408434073640897
80204407913063030018429547771656128120407967711870173016525160769808098000080100800004116401873198119249375784080240761306633306861601002008000020016000040803861180201100991008000010080000100809414439704356855805322990831821519813635653991400511011611406558344280000801004078640830407034079440784
802044082730533300174689579516801021244073077417461812132251606858679780000801008000040224718786681436493768340700407323070733078116010020080000200160000407799211802011009910080000100800001008088540408545414909805713302920401701814585223907272511011711408268198080000801004072640708407424083040843
80204407883052200018728658061632117124406987641897172411325165349828008000080100800004021721876724128149376624073940759306273307981601002008000020016000040742851180201100991008000010080000100808702739834785868805162810889401596813655414554260511011711408398032080000801004074340738408234076240762

1000 unrolls and 10 iterations

Result (median cycles for code divided by count): 0.5101

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | 09 | l2 tlb miss data (0b) | 18 | 19 | 1e | 1f | 20 | 22 | 29 | 3a | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 60 | 67 | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | c3 | cf | l1i cache miss demand (d3) | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
80029406813050000001602710788161612552409707711752184322625160369806678009580010800004015741880776026449377114081340731307743307021600102080000201600004079675118002110910800001080000108086983950481387980569268186636156381429500416613105020031732409678014480000800104073140804407044079140834
800244078730610000017857807781688106724090180216891780188251651558049580101800108000040907318779440145349376274083640794307543307501600102080000201600004076675118002110910800001080000108087404518466168978054829808443616048138054039450005020031623408228068580000800104073240848407304083740779
80024407453060000001776874812170410114040817753147315941412516054580532800858001080000402122187590402234937627407444077630836330779160010208000020160000407167511800211091080000108000010808641438444921286280534277187736150481416458435614005020031723407838058980000800104083740880408084076240708
800244080630510110017977668051680125148407227721988169720425160655832648010480010800004014551879408053649378164068840820307923308141600102080000201600004083575118002110910800001080000108087403934472128418048326108713415288143650746870005020031833407278651380000800104079740766407564080340802
80024408403050000001797750751172811612040808804151718351352516055685215800318001080000400836187715202664937693408434067630646330781160010208000020160000408827511800211091080000108000010809361343544893852805142570878126153881363557396314005020031632408398067780000800104081640891408034086840964
800244074730611000017137637801624117104408997941846174516225166016880098006280010800004029111873192014549377084080440770307723307601600102080000201600004076076118002110910800001080000108090474266503791380490277186340164481403539375313005020031632407268051080000800104090040837408524083040766
80024408383061110001881739784168011712840679780160616181772516384680457800008001080000410421187422402894937707407844084830868330726160010208000020160000406997511800211091080000108000010808740436545748408056126108604016298140051145560005020231833408398063880000800104073240810408494084840842
80024407883060000001902786784167212912440776793179117931782516056983740800438001080000413639187720003864937736407674072130765330842160010208000020160000407197511800211091080000108000010809101244974691084680527272190540159881394532350714005020031633407728320680000800104071640818407904081340863
800244076930611110017647657921632119108408177781894138017725160683810198000080010800004080981873096067149377674087940810307963308611600102080000201600004080675118002110910800001080000108084815456746610858805412830841154157481389526396714105020031732407948044980000800104084640812408144077640791
800244085730500000016027667981624102100407348161918160117725160679895308000080010800004024521875496018794937731408334083130723330788160010208000020160000408967511800211091080000108000010808911441795131184380572259185640161581416511486113005020031623407778057480000800104073340871407204072940874
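
The throughput results are given in cycles per store, which is easier to read as stores per cycle. The short sketch below takes the two reported medians and inverts them; the function name and the constant COUNT are illustrative, with COUNT mirroring the "Count: 8" line of the test.

```python
def stores_per_cycle(cycles_per_store):
    """Invert a cycles-per-instruction figure into instructions per cycle."""
    return 1.0 / cycles_per_store

COUNT = 8  # independent pre-indexed stores per group, from "Count: 8"

# Reported medians (cycles for code divided by count) from Test 3.
results = {
    "100 unrolls x 100 iterations": 0.5099,
    "1000 unrolls x 10 iterations": 0.5101,
}

for label, cycles in results.items():
    rate = stores_per_cycle(cycles)
    group = COUNT * cycles
    print(f"{label}: {rate:.2f} stores/cycle, {group:.2f} cycles per group of {COUNT}")
```

Both runs come out just under two pre-indexed stores per cycle. Since each store in the group uses a different base register (x6 through x13), the stores are independent of each other, so this figure reflects sustained throughput rather than latency. Which resource sets the limit is not determined by this test alone.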