Apple Microarchitecture Research by Dougall Johnson

STLXRH

Test 1: uops

Code:

  stlxrh w0, w1, [x6]
  mov x0, 0

(no loop instructions)

1000 unrolls and 1 iteration

Retires: 1.000

Issues: 1.000

Integer unit issues: 0.001

Load/store unit issues: 1.000

SIMD/FP unit issues: 0.000

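retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch ldst uop (58) | simd uops in schedulers (5a) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed)
1005 | 3154 | 1019 | 1 | 1018 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000
1004 | 3047 | 1001 | 1 | 1000 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000
1004 | 3047 | 1001 | 1 | 1000 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000
1004 | 3047 | 1001 | 1 | 1000 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000
1004 | 3047 | 1001 | 1 | 1000 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000
1004 | 3047 | 1001 | 1 | 1000 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000
1004 | 3047 | 1001 | 1 | 1000 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000
1004 | 3047 | 1001 | 1 | 1000 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000
1004 | 3047 | 1001 | 1 | 1000 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000
1004 | 3047 | 1001 | 1 | 1000 | 1000 | 51855 | 1000 | 1000 | 2000 | 1 | 1000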

Test 2: throughput

Code:

  stlxrh w0, w1, [x6]
  add x6, x6, 2

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code): 3.0417

retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef)
20206 | 30675 | 20194 | 10158 | 0 | 10036 | 10157 | 0 | 10003 | 35479 | 339385 | 20106 | 10203 | 10003 | 10203 | 20006 | 10004 | 10000 | 10100
20204 | 30435 | 20104 | 10104 | 0 | 10000 | 10103 | 0 | 10002 | 35467 | 339130 | 20104 | 10202 | 10002 | 10202 | 20004 | 10003 | 10000 | 10100
20204 | 30420 | 20103 | 10103 | 0 | 10000 | 10102 | 0 | 10002 | 35467 | 339130 | 20104 | 10202 | 10002 | 10202 | 20004 | 10003 | 10000 | 10100
20204 | 30421 | 20103 | 10103 | 0 | 10000 | 10102 | 0 | 10002 | 35467 | 339339 | 20104 | 10202 | 10002 | 10202 | 20004 | 10003 | 10000 | 10100
20204 | 30433 | 20103 | 10103 | 0 | 10000 | 10102 | 0 | 10002 | 35467 | 339069 | 20104 | 10202 | 10002 | 10202 | 20004 | 10003 | 10000 | 10100
20204 | 30414 | 20103 | 10103 | 0 | 10000 | 10102 | 0 | 10002 | 35467 | 339040 | 20104 | 10202 | 10002 | 10202 | 20004 | 10003 | 10000 | 10100
20204 | 30425 | 20103 | 10103 | 0 | 10000 | 10102 | 0 | 10002 | 35467 | 339037 | 20104 | 10202 | 10002 | 10202 | 20004 | 10003 | 10000 | 10100
20204 | 30421 | 20103 | 10103 | 0 | 10000 | 10102 | 0 | 10002 | 35467 | 339108 | 20104 | 10202 | 10002 | 10202 | 20004 | 10003 | 10000 | 10100
20204 | 30423 | 20103 | 10103 | 0 | 10000 | 10102 | 0 | 10002 | 35467 | 339209 | 20104 | 10202 | 10002 | 10202 | 20004 | 10003 | 10000 | 10100
20204 | 30417 | 20103 | 10103 | 0 | 10000 | 10102 | 0 | 10002 | 35467 | 339156 | 20104 | 10202 | 10002 | 10202 | 20004 | 10003 | 10000 | 10100

1000 unrolls and 10 iterations

Result (median cycles for code): 3.0456

retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef)
20026 | 30706 | 20102 | 10066 | 0 | 10036 | 10065 | 0 | 10002 | 35242 | 339907 | 20014 | 10022 | 10002 | 10020 | 20000 | 10001 | 10000 | 10010
20024 | 30467 | 20011 | 10011 | 0 | 10000 | 10010 | 0 | 10000 | 35235 | 339542 | 20010 | 10020 | 10000 | 10020 | 20000 | 10001 | 10000 | 10010
20024 | 30445 | 20011 | 10011 | 0 | 10000 | 10010 | 0 | 10000 | 35235 | 339818 | 20010 | 10020 | 10000 | 10020 | 20000 | 10001 | 10000 | 10010
20024 | 30465 | 20011 | 10011 | 0 | 10000 | 10010 | 0 | 10000 | 35235 | 339416 | 20010 | 10020 | 10000 | 10020 | 20000 | 10001 | 10000 | 10010
20024 | 30448 | 20011 | 10011 | 0 | 10000 | 10010 | 0 | 10000 | 35235 | 339507 | 20010 | 10020 | 10000 | 10084 | 20128 | 10063 | 10000 | 10010
20024 | 30464 | 20011 | 10011 | 0 | 10000 | 10010 | 0 | 10000 | 35235 | 339941 | 20010 | 10020 | 10000 | 10020 | 20000 | 10001 | 10000 | 10010
37873 | 57723 | 35132 | 18555 | 71 | 16506 | 18020 | 73 | 10000 | 35235 | 339931 | 20010 | 10020 | 10000 | 10022 | 20004 | 10003 | 10000 | 10010
20024 | 30462 | 20014 | 10014 | 0 | 10000 | 10013 | 0 | 10000 | 35235 | 339766 | 20010 | 10020 | 10000 | 10020 | 20000 | 10001 | 10000 | 10010
20024 | 30466 | 20011 | 10011 | 0 | 10000 | 10010 | 0 | 10000 | 35235 | 340108 | 20010 | 10020 | 10000 | 10020 | 20000 | 10001 | 10000 | 10010
20024 | 30439 | 20011 | 10011 | 0 | 10000 | 10010 | 0 | 10000 | 35235 | 339715 | 20010 | 10020 | 10000 | 10020 | 20000 | 10001 | 10000 | 10010

Test 3: throughput

Code:

  stlxrh w0, w1, [x6]
  mov x7, 8

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code): 3.0047

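retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef)
10205 | 30150 | 10119 | 101 | 10018 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100
10204 | 30047 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100
10204 | 30047 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100
10204 | 30047 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100
10204 | 30047 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100
10204 | 30047 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100
10204 | 30047 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100
10204 | 30047 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100
10204 | 30047 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100
10204 | 30047 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 528855 | 10100 | 200 | 10004 | 200 | 20008 | 1 | 10000 | 0 | 100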
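
The reported result for this configuration can be related to the raw counts. The following is a minimal sketch in Python, not part of the original test harness: the function name `cycles_per_instance` is hypothetical, and the cycle values are as read from the cycle (02) counter of the ten runs listed for this configuration. It assumes the result is the median cycle count divided by unrolls × iterations.

```python
from statistics import median


def cycles_per_instance(cycle_counts, unrolls, iterations):
    """Median cycle count divided by the number of executed instances.

    Assumption: this mirrors how the "median cycles for code" figure is
    derived; the original harness may differ in detail.
    """
    return median(cycle_counts) / (unrolls * iterations)


# cycle (02) values of the ten runs: the first run is slower,
# the remaining nine are identical.
cycles = [30150] + [30047] * 9

print(f"{cycles_per_instance(cycles, 100, 100):.4f}")  # prints 3.0047
```

For other configurations the same division may not reproduce the reported figure exactly, since the reported median may be taken over a different set of runs than the ten listed.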

1000 unrolls and 10 iterations

Result (median cycles for code): 3.0047

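retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef)
10025 | 30159 | 10029 | 11 | 10018 | 10 | 10000 | 30 | 528945 | 10010 | 20 | 10000 | 20 | 20000 | 1 | 10000 | 10
10024 | 30049 | 10011 | 11 | 10000 | 10 | 10000 | 30 | 528855 | 10010 | 20 | 10000 | 20 | 20000 | 1 | 10000 | 10
10024 | 30050 | 10011 | 11 | 10000 | 10 | 10000 | 30 | 528909 | 10010 | 20 | 10000 | 20 | 20000 | 1 | 10000 | 10
10024 | 30049 | 10011 | 11 | 10000 | 10 | 10000 | 30 | 528999 | 10010 | 20 | 10000 | 20 | 20000 | 1 | 10000 | 10
10024 | 30054 | 10011 | 11 | 10000 | 10 | 10000 | 30 | 528855 | 10010 | 20 | 10000 | 20 | 20000 | 1 | 10000 | 10
10024 | 30047 | 10011 | 11 | 10000 | 10 | 10000 | 30 | 528855 | 10010 | 20 | 10000 | 20 | 20000 | 1 | 10000 | 10
10024 | 30047 | 10011 | 11 | 10000 | 10 | 10000 | 30 | 528855 | 10010 | 20 | 10000 | 20 | 20000 | 1 | 10000 | 10
10024 | 30117 | 10011 | 11 | 10000 | 10 | 10000 | 30 | 529089 | 10010 | 20 | 10000 | 20 | 20000 | 1 | 10000 | 10
10024 | 30053 | 10011 | 11 | 10000 | 10 | 10036 | 30 | 530169 | 10046 | 20 | 10044 | 20 | 20000 | 1 | 10000 | 10
10024 | 30068 | 10011 | 11 | 10000 | 10 | 10000 | 30 | 528855 | 10010 | 20 | 10004 | 20 | 20000 | 1 | 10000 | 10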