Apple Microarchitecture Research by Dougall Johnson

STR (post-index, 64-bit)

Test 1: uops

Code:

  str x0, [x6], #8

(no loop instructions)

1000 unrolls and 1 iteration

Retires: 1.000

Issues: 2.000

Integer unit issues: 1.000

Load/store unit issues: 1.000

SIMD/FP unit issues: 0.000

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | 09 | l2 tlb miss data (0b) | 1e | 1f | 20 | 22 | 29 | 3a | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 60 | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst int store (96) | inst ldst (9b) | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
1005104080000073610001025160002520001000100010001000507784582411040104082438982000100020001040124111001100010001022056514100000141201014756717311611103710001000100010411041104110411099
1004104081000612141054102516203252000100010001000100050754458240104010408243898200010002000104012411100110001000102285643310070020271008732727311611103710001000100010411041104110411041
100410408111101026106410251231752520001000100010001000507544582411040104082438982000100020001040124111001100010001036748116100801142471014748707311611103710001000100010411041104110411041
1004104081011010160050102561032520001000100010001000507544582401040104082438982000100020001040124111001100010001022746522101220261471000752727311611103710001000100010411041104110411041
100410408110001100060102510035252000100010001000100050746458241104010408243898200010002000104012411100110001000102184011510120101271012748727311611103710001000100010411041104110411041
100410407100101012101121025101242520001000100010001000507544582411040104082438982000100020001040124111001100010001025740122100701121271022852717311611103710001000100010411041104110411041
100410408101001020007010251031425200010001000100010005075445824010401040824389820001000200010401241110011000100010187722161008310071000734707311611103710001000100010411041104110411041
10041040810010100008010258313252000100010001000100050754458241104010408243898200010002000104012411100110001000103184031210081019671028740717311611103710001000100010411041104110411041
100410407110101014207010258324252000100010001000100050762458241104010408243898200010002000104012411100110001000102285652010071220671020848727311611103710001000100010411091104110411041
100410407111101014105010258333252000100010001000100050754458241104010408243898200010002000104012411100110001000102274851210070114271018748717312511103710001000100010411041104110411041

Test 2: Latency 2->2

Code:

  str x0, [x6], #8

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code): 1.0040

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | 1e | 1f | 20 | 22 | 29 | 3a | 3e | 3f | 40 | 44 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ea | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
1020910040750227491828169669120100258040565632252010010100100001010010000522171468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010904129538106941025024790240833109169116400710117111003710000210000101001004110041100411004110041
102041004075022178782617687212010025797010073262520100101001000010100100005220674688244969601004010040867438747201002001000020020000100401221110201100991001000010010000100108821299357067810249272906369011094816115900710117111003710000010000101001004110041100411004110041
1020410040750217882788172082961002581405752362520100101001000010100100005221714688244969601004010040867438747201002001000020020000100401221110201100991001000010010000100108861261397067110236258880327201092411114500710117111003710000110000101001004110041100411004110041
1020410040750215178830181672961002582606465452520100101001000010100100005222034688244969601004010040867438747201002001000020020000101951221110201100991001000010010000100109061253367068910256254882428301092110119100710117111003710000710000101001004110041100411004110041
102041004075020439883217207512010025814089107242520100101001000010100100005221634688244969601004010040867438747201002001000020020000100401221110201100991001000010010000100109341351383068110244275920328431090915119100710117111003710000010000101001004110041100411004110041
1020410040750219095835170486116100257910807528252010010100100001010010000522115468824496960100401004086743874720100200100002002000010040122111020110099100100001001000010010938132037526631023627993640838109115116400710117111003710000410000101001004110041100411004110041
10204100407502379848341600701001002579407776272520100101001000010100100005221714688244969601004010040867438747201002001000020020000100401221110201100991001000010010000100109361355379667810255297934407781092215129100710117111003710000010000101001004110041100411004110041
102041004075021427479516967412010025770063101192520100101001000010100100005221874688244969601004010040867438747201002001000020020000100401221110201100991001000010010000100108861271381066810249269894388351093716121400710117111003710000410000101001004110041100411004110041
102041004075020949183217047012010025781085107282520100101001000010100100005222034688244969601004010040867438747201002001000020020000100401221110201100991001000010010000100108781298367066410244244886327921093513121700710117111003710000010000101001004110041100411004110041
102041004075021159080317527810010025795011367382520100101001000010100100005222034688244969601004010040867438747201002001000020020000100401221110201100991001000010010000100109301354397067010244249932328501090516129871710117111003710000410000101001004110041100411004110041

1000 unrolls and 10 iterations

Result (median cycles for code): 1.0040

retire uop (01) | cycle (02) | 03 | 1e | 1f | 20 | 22 | 29 | 3a | 3c | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ea | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
100291004075195387825256087027210025834108154212520010100101000010010100005211134688244969601004010040869638770200102010000202000010040124111002110910100001010000101089413903750658102312868789081910933131316640316331003710000110000100101004110041100411004110041
10024100407518218680925929212761002579711387372520010100101000010010100005211154688244969601004010040869638770200102010000202000010040124111002110910100001010000101094411973690658102512568645878710913181218640316331003710000210000100101004110041100411004110041
10024100407621189684025766902761002579667102262520010100101000010010100005211214688244969601004010040869638770200102010000202000010040124111002110910100001010000101093813123860707102372658866487710943171248640316331003710000010000100101004110041100411004110041
10024100407520379884025689802481002582913977402520010100101005210010100005211234688244969601004010040869638770200102010000202000010040124111002110910100001010000101087614033890643102552629106487110914201249640216331003710000010000100101004110041100411004110041
10024100407521039584025768302401002578012296202520010100101000010010100005211134688244969601004010040869638770200102010000202000010040124111002110910100001010000101094414683920669102362749108080510929211176640316231003710000110000100101004110041100411004110041
1002410040751965998392544771244100257751271222025200101001010000100101000052112946882449696010040100408696387702001020100002020000100401241110021109101000010100001010874137838307101024827091011884710932161162640316221003710000010000100101004110041100411004110041
10024100407620708584925449602161002576110879262520010100101000010010100005211134688244969601004010040869638770200102010000202000010040124111002110910100001010000101091814793630664102432559088680110891131188640316331003710000110000100101004110041100411004110041
100241004075201986843260876022810025833129116282520010100101000010010100005211374688244969601004010040869638770200102010000202000010040124111002110910100001010000101089613553780664102602629326493010933211208640216321003710000210000100101004110041100411004110041
100241004075212195811254485026810025786131145252520117100101000010010100005210754688244969601004010040869638770200102010000202000010040124111002110910100001010000101092412743980671102422728506276710943221128640316221003710000110000100101004110041100411004110041
100241004075226291822256888122810025851131109302520010100101000010010100005211454688244969601004010040869638770200102010000202000010040124111002110910100001010000101088813203920683102252819348481710938221205640316331003710000010000100101004110041100411004110041
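
The reported per-instruction figure for Test 2 can be reproduced from the raw samples with a small amount of arithmetic. The sketch below is illustrative only: it assumes that the second field of each sample row is the cycle (02) counter and that it reads 10040 in both runs, and the helper name `cycles_per_instruction` is hypothetical, not part of the original tooling.

```python
def cycles_per_instruction(total_cycles, unrolls, iterations, count=1):
    """Normalize a raw cycle count to cycles per measured instruction.

    count is the number of measured instructions in one copy of the code
    (1 for the single STR used in Test 2).
    """
    return total_cycles / (unrolls * iterations * count)


# Assumption: cycle (02) reads 10040 in the samples of both runs.
run_a = cycles_per_instruction(10040, unrolls=100, iterations=100)
run_b = cycles_per_instruction(10040, unrolls=1000, iterations=10)

print(f"{run_a:.4f} {run_b:.4f}")  # prints "1.0040 1.0040"
```

Both runs execute 10000 copies of the store, so the same cycle count gives the same 1.0040 result. Each store uses the x6 value written back by the previous one, so this figure suggests the post-index update chain costs about one cycle per instruction.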

Test 3: throughput

Count: 8

Code:

  str x0, [x6], #8
  str x0, [x7], #8
  str x0, [x8], #8
  str x0, [x9], #8
  str x0, [x10], #8
  str x0, [x11], #8
  str x0, [x12], #8
  str x0, [x13], #8
  mov x7, x6
  mov x8, x6
  mov x9, x6
  mov x10, x6
  mov x11, x6
  mov x12, x6
  mov x13, x6

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code divided by count): 0.5104

retire uop (01) | cycle (02) | 03 | l1d tlb fill (05) | mmu table walk data (08) | 09 | l2 tlb miss data (0b) | 19 | 1e | 1f | 20 | 22 | 29 | 3a | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 60 | 67 | 69 | 6a | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d tlb miss (a1) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | a9 | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | l1d tlb miss nonspec (c1) | c2 | c3 | branch cond mispred nonspec (c5) | branch mispred nonspec (cb) | cd | cf | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
802094062630710000186091483117201471124079978317321848183251608928070180025801008000040327818774680455493773440889407803069433085416010020080000200160000408718511802011009910080000100800001008099916441946089018061827919173217108148650842201400000511011611408268105580000801004092140925408444084640967
802044083030510000189085583317281281124073880720271855152251604848583580000801008000042638118805640356493769040839408003070133094516010020080000200160000409298711802011009910080000100800001008099514497148058738055328318933816568146754347721410000511011711408368093080000801004082240918409124088140888
802044083030600000192685582717441191004073083218972026163251652658284580000801008000040256818778130318493781640848408913067233081216010020080000200160000408608711802011009910080000100800001008103916444147949198055526909192816148147051140761410000511011611409218341780000801004091240839408674076840919
802044084430510100192389579517441221204074480517921812255251607938077480000801008000040223818724531395493787240729408563073133079016010020080000200160000408328711802011009910080000100800001008101315494648919038058727309417416988150355649491400000511011611407738030180000801004087040792407494094140833
802044079130610100191491180517041282284075178621172028270251609648468580000801008000040163218784661135449376694079540816306743308881601002008000020016000041024751180201100991008000010080000100810310513546411908805983010927901786814725424339000000511011711408088113980000801004082840847408314091040844
8020440864306000002055789799175212116040747788192217131752516067581337800098010080000401849188245602654937858407234080430715330842160100200800002001600004076785118020110099100800001008000010080993042165099921805072690888361752814585254346000000511011711408388884180000801004083040697408374080940904
80204408503050000019389978191712126120407337941948189525525160731809048000180100800004041531874800126649378844090740731307413308521601002008000020016000040813761180201100991008000010080000100810251345675007947805412920929661661814885654602000000511011711407148050580000801004080040945407964089640821
802044076030611110181279882017121151084078881019421994175251610458066880030801008000040229718756231213493773040834407433075333072716010020080000200160000409158711802011009910080000100800001008097830477949214898806172621937261512814605314572140271000511011711407728061280000801004082740877408394079340842
80204408293071011019148448771744127108409377711829186520325166454845728000080100800004042461875860169493768740882408353076333080616010020080000200160000408717611802011009910080000100800001008099016479551448958057628809321241700815945764697000000511011711409348093780000801004089040814407054089440913
8020440863306000001908866810178412011640895779203919161552516261980754800008010080000402405187729614024937748407734079230625330817160100200800002001600004083276118020110099100800001008000010080893043424612902805562710885741716814935774239000000511011711407358099080000801004080040857408054087340861
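
As a rough cross-check of the 0.5104 figure for the 100-unroll run, the median can be recomputed and inverted to get a stores-per-cycle rate. This is a sketch under stated assumptions: the `cycles` list is read from the ten sample rows on the assumption that the first two fields (retire uop, cycle) are five digits each, and whether the original tooling computes its median the same way is not confirmed by the source.

```python
from statistics import median

# Assumption: cycle (02) values taken from the ten samples of the
# 100-unroll, 100-iteration run, reading the second 5-digit field of each row.
cycles = [40626, 40830, 40830, 40844, 40791, 40864, 40850, 40760, 40829, 40863]

# 100 unrolls * 100 iterations * 8 measured stores per copy of the code.
stores = 100 * 100 * 8

cycles_per_store = median(cycles) / stores
stores_per_cycle = 1 / cycles_per_store

print(f"{cycles_per_store:.4f} cycles/store, {stores_per_cycle:.2f} stores/cycle")
# prints "0.5104 cycles/store, 1.96 stores/cycle"
```

Under these assumptions the recomputed value matches the reported result, and its reciprocal is consistent with a sustained rate of roughly two post-index stores per cycle.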

1000 unrolls and 10 iterations

Result (median cycles for code divided by count): 0.5102

retire uop (01) | cycle (02) | 03 | 1e | 1f | 20 | 22 | 29 | 3a | 3e | 3f | 40 | 46 | 49 | 4f | 51 | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | 60 | 67 | 69 | 6a | 6b | 6d | 6e | map stall dispatch (70) | map rewind (75) | map stall (76) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | 82 | 83 | flush restart other nonspec (84) | 85 | inst all (8c) | inst branch (8d) | inst branch taken (90) | inst branch cond (94) | inst int store (96) | inst int alu (97) | inst ldst (9b) | 9f | l1d tlb access (a0) | l1d cache miss st (a2) | l1d cache miss ld (a3) | a4 | ld unit uop (a6) | st unit uop (a7) | l1d cache writeback (a8) | aa | ab | ac | af | bc | l1d cache miss st nonspec (c0) | cf | d2 | l1i cache miss demand (d3) | d5 | map dispatch bubble (d6) | dd | fetch restart (de) | e0 | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) | f5 | f6 | f7 | f8 | fd
80029407793071848887795180013314440747770164716951442516757080628800028001080000401416187895201544937712040783408513089133088416001020800002016000040809751180021109108000010800001080869349545648618053126886132169781343504390350200651656408078330780000800104082740716409044074140731
80024408293051776843760168010111640760800168216771292516041981937800008001080000420095187708002744937748040815408213068333078616001020800002016000040786751180021109108000010800001080838460047348388055825787434164281457549369250200651756408078077280000800104077940814408104078840862
800244087330619148397931640105140407657431842170116925160556805888000080010800004015351870504018898377560408124092730710330793160010208000020160000408377511800211091080000108000010809074529524128538056428889242152181408508445550200651666408238256680000800104089840894409374078840798
80024408313061710805763165699136408247681952175815325160473805818000080010800004186751874800043874937708040764407473074233074316001020800002016000040853821180021109108000010800001080835373844278628049525280940151981459495373550200651666407618661780000800104082640795407434081240803
8002440719305183972675817849412840854758174416281432516050680415800518001080000423909187734412574937782040806407873080933086216001020800002016000040829751180021109108000010800001080871431745678428049728686936161481324566366650200651666407698041380000800104079840864408284086340797
80024408753051563876758170497124407817351717193613025163405805608000080010800004026401878832028049377310408564087130765330700160010208000020160000408177511800211091080000108000010808474365458128538061026191436164381445452352550200651655408938032480000800104073240746408134081340877
8002440853305189985378617121091444066677215771663812516052580492800728001080000401652187446402814937723040819408443077133086116001020800002016000040855821180021109108000010800001080835408945168828047027487136143281388532434650200661656407838444680000800104098740892408794087640745
80024407613061845888820171210811240784773182717441812516467184714800008001080000402437187842402844937713040767407713085833082516001020800002016000040856751180021109108000010800001080888392646788738054027384238156981474551376250200651655408178375380000800104091440822408154074140770
80024407913061854725774168811984407287461713179114325167731805188000080010800004018351876829028549377290408024090430709330868160010208000020160000408007511800211091080000108000010808493760444118848050026389550164981450520387250200751855408198291880000800104089840757407624079840797
80024407913051635870809177610815240747736186319031532516792480580800008001080000401692187900002454937723040771407623081433072316001020800002016000040769751180021109108000010800001080819381546748528048426787042155081426462432950200661755409078456080000800104083640790407324081340803