Apple Microarchitecture Research by Dougall Johnson
Code:
stclrl w0, [x6] nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 3.000
Issues: 3.001
Integer unit issues: 1.002
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
73005 | 34864 | 3034 | 1024 | 2010 | 1005 | 2000 | 7762 | 10513 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34128 | 3002 | 1002 | 2000 | 1000 | 2000 | 7762 | 10513 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34134 | 3002 | 1002 | 2000 | 1000 | 2000 | 7762 | 10513 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34153 | 3002 | 1002 | 2000 | 1000 | 2000 | 7762 | 10513 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34134 | 3002 | 1002 | 2000 | 1000 | 2000 | 7762 | 10513 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34125 | 3002 | 1002 | 2000 | 1000 | 2000 | 7762 | 10513 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34141 | 3002 | 1002 | 2000 | 1000 | 2002 | 7765 | 10519 | 3003 | 1001 | 2002 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34429 | 3002 | 1002 | 2000 | 1000 | 2000 | 7762 | 10513 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34135 | 3002 | 1002 | 2000 | 1000 | 2000 | 7762 | 10513 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34119 | 3002 | 1002 | 2000 | 1000 | 2000 | 7762 | 10513 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
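The summary figures for this test (integer and load/store issues per instruction) can be related to the raw counter rows by dividing each column by the number of times the code ran. The sketch below is only illustrative: the iteration count of 1000 is an assumption inferred from the magnitudes in the table, not something the page states, and `split_row` and `per_iteration` are hypothetical helper names. Only the first seven columns and three rows are copied in to keep the example short.

```python
from statistics import median

ASSUMED_ITERATIONS = 1000  # assumption: the page does not state the run count
NOPS = 70                  # padding nops in the test code above

# First seven column names and three rows, copied from the table above.
HEADER = ("retire uop (01) | cycle (02) | schedule uop (52) | "
          "schedule int uop (53) | schedule ldst uop (55) | "
          "dispatch int uop (56) | dispatch ldst uop (58) |")
ROWS = """
73005 | 34864 | 3034 | 1024 | 2010 | 1005 | 2000 |
73004 | 34128 | 3002 | 1002 | 2000 | 1000 | 2000 |
73004 | 34134 | 3002 | 1002 | 2000 | 1000 | 2000 |
"""

def split_row(line):
    """Split one pipe-delimited line into stripped cells, ignoring the trailing pipe."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]

def per_iteration(header, rows, iterations):
    """Return the median of each counter column divided by the iteration count."""
    names = split_row(header)
    columns = {name: [] for name in names}
    for line in rows.strip().splitlines():
        for name, cell in zip(names, split_row(line)):
            columns[name].append(int(cell))
    return {name: median(values) / iterations for name, values in columns.items()}

rates = per_iteration(HEADER, ROWS, ASSUMED_ITERATIONS)
print("int uops scheduled :", rates["schedule int uop (53)"])
print("ldst uops scheduled:", rates["schedule ldst uop (55)"])
print("retires minus nops :", round(rates["retire uop (01)"] - NOPS, 3))
```

Under that assumption the integer and load/store values come out at 1.002 and 2.0, in line with the figures reported above. The retire value comes out slightly above the reported 3.000, so the page's own numbers presumably apply a correction that is not shown here.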
Code:
stclrl w0, [x6]
add x6, x6, 4
(fused SUBS/B.cc loop)
Result (median cycles for code): 6.0065
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40207 | 60376 | 40304 | 20249 | 20055 | 20169 | 20005 | 115674 | 95959 | 40110 | 20205 | 20005 | 30208 | 40009 | 20008 | 20000 | 20100 |
40204 | 60065 | 40106 | 20106 | 20000 | 20102 | 20002 | 115783 | 95699 | 40104 | 20202 | 20002 | 30203 | 40004 | 20006 | 20000 | 20100 |
40204 | 60065 | 40106 | 20106 | 20000 | 20102 | 20002 | 115783 | 95700 | 40104 | 20202 | 20002 | 30203 | 40004 | 20006 | 20000 | 20100 |
40204 | 60065 | 40106 | 20106 | 20000 | 20102 | 20002 | 115777 | 95689 | 40104 | 20202 | 20002 | 30203 | 40004 | 20006 | 20000 | 20100 |
40204 | 60065 | 40106 | 20106 | 20000 | 20102 | 20002 | 115773 | 95681 | 40104 | 20202 | 20002 | 30203 | 40004 | 20006 | 20000 | 20100 |
40204 | 60065 | 40106 | 20106 | 20000 | 20102 | 20002 | 115785 | 95701 | 40104 | 20202 | 20002 | 30203 | 40004 | 20006 | 20000 | 20100 |
40204 | 60065 | 40106 | 20106 | 20000 | 20102 | 20036 | 97914 | 103837 | 40172 | 20236 | 20036 | 30203 | 40004 | 20006 | 20000 | 20100 |
40204 | 60065 | 40106 | 20106 | 20000 | 20102 | 20002 | 115785 | 95705 | 40104 | 20202 | 20002 | 30203 | 40004 | 20006 | 20000 | 20100 |
40204 | 60065 | 40106 | 20106 | 20000 | 20102 | 20002 | 115777 | 95688 | 40104 | 20202 | 20002 | 30203 | 40004 | 20006 | 20000 | 20100 |
40204 | 60066 | 40106 | 20106 | 20000 | 20102 | 20002 | 115769 | 95664 | 40104 | 20202 | 20002 | 30203 | 40004 | 20006 | 20000 | 20100 |
Result (median cycles for code): 6.0065
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
40027 | 60376 | 40210 | 20159 | 20051 | 20074 | 20036 | 95936 | 107875 | 40082 | 20056 | 20036 | 30023 | 40004 | 20006 | 20000 | 0 | 20010 |
40024 | 60065 | 40016 | 20016 | 20000 | 20010 | 20000 | 115546 | 95592 | 40010 | 20020 | 20000 | 30020 | 40000 | 20006 | 20000 | 0 | 20010 |
40024 | 60065 | 40016 | 20016 | 20000 | 20010 | 20000 | 115550 | 95599 | 40010 | 20020 | 20000 | 30020 | 40000 | 20006 | 20000 | 0 | 20010 |
40024 | 60065 | 40016 | 20016 | 20000 | 20010 | 20000 | 115552 | 95603 | 40010 | 20020 | 20000 | 30020 | 40000 | 20006 | 20000 | 0 | 20010 |
40024 | 60065 | 40016 | 20016 | 20000 | 20010 | 20000 | 115552 | 95601 | 40010 | 20020 | 20000 | 30020 | 40000 | 20006 | 20000 | 0 | 20010 |
40024 | 60065 | 40016 | 20016 | 20000 | 20010 | 20000 | 115550 | 95597 | 40010 | 20020 | 20000 | 30020 | 40000 | 20006 | 20000 | 0 | 20010 |
40024 | 60065 | 40016 | 20016 | 20000 | 20010 | 20000 | 115556 | 95607 | 40010 | 20020 | 20000 | 30020 | 40000 | 20006 | 20000 | 0 | 20010 |
40024 | 60065 | 40016 | 20016 | 20000 | 20010 | 20000 | 115552 | 95601 | 40010 | 20020 | 20000 | 30020 | 40000 | 20006 | 20000 | 0 | 20010 |
40024 | 60065 | 40016 | 20016 | 20000 | 20010 | 20000 | 115548 | 95594 | 40010 | 20020 | 20000 | 30020 | 40000 | 20006 | 20000 | 0 | 20010 |
40025 | 60113 | 40085 | 20053 | 20032 | 20044 | 20000 | 115558 | 95615 | 40010 | 20020 | 20000 | 30020 | 40000 | 20006 | 20000 | 0 | 20010 |
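The "Result (median cycles for code)" figure for this test is consistent with taking the median of the cycle (02) column and dividing by the loop count. A minimal sketch, assuming 10000 loop iterations (the page does not state the count, and `median_cycles_per_iteration` is a hypothetical helper name):

```python
from statistics import median

ASSUMED_ITERATIONS = 10_000  # assumption: not stated on the page

# "cycle (02)" column of the ten runs in the table directly above.
cycles = [60376, 60065, 60065, 60065, 60065,
          60065, 60065, 60065, 60065, 60113]

def median_cycles_per_iteration(samples, iterations):
    """Median total cycle count across runs, normalised to one loop iteration."""
    return median(samples) / iterations

result = median_cycles_per_iteration(cycles, ASSUMED_ITERATIONS)
print(f"{result:.4f}")
```

Using the median rather than the mean keeps the two slower runs (60376 and 60113 cycles) from skewing the estimate.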
Code:
stclrl w0, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 10.7766
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30206 | 113158 | 44890 | 22195 | 22695 | 12816 | 21264 | 2017391 | 1951844 | 32465 | 11304 | 22111 | 23514 | 45775 | 20637 | 20000 | 10100 |
30204 | 108201 | 41365 | 20060 | 21305 | 11243 | 20296 | 1951178 | 1910267 | 30636 | 10441 | 20479 | 20948 | 41382 | 19426 | 20000 | 10100 |
30204 | 107383 | 40316 | 19692 | 20624 | 10444 | 20168 | 1925796 | 1896132 | 30384 | 10316 | 20219 | 21597 | 42525 | 19847 | 20000 | 10100 |
30204 | 106851 | 40707 | 19970 | 20737 | 10901 | 20931 | 1982614 | 1924379 | 31857 | 11034 | 21607 | 22554 | 44265 | 20102 | 20000 | 10100 |
30204 | 107783 | 41126 | 20019 | 21107 | 11104 | 20742 | 1965430 | 1921714 | 31508 | 10866 | 21283 | 20398 | 40356 | 19077 | 20000 | 10100 |
30205 | 106772 | 39900 | 19331 | 20569 | 10383 | 21321 | 1964440 | 1922110 | 32471 | 11251 | 22047 | 20896 | 41324 | 19268 | 20000 | 10100 |
30204 | 109777 | 42572 | 20833 | 21739 | 11896 | 21351 | 1970336 | 1918208 | 32633 | 11384 | 22286 | 20609 | 40777 | 19151 | 20000 | 10100 |
30204 | 106930 | 40616 | 19590 | 21026 | 10833 | 20762 | 1967186 | 1915863 | 31428 | 10766 | 21228 | 22528 | 44021 | 19966 | 20000 | 10100 |
30204 | 106840 | 40640 | 19823 | 20817 | 10804 | 20589 | 1949881 | 1914019 | 31179 | 10690 | 20954 | 20352 | 40266 | 19051 | 20000 | 10100 |
30204 | 106062 | 39411 | 19086 | 20325 | 10171 | 20293 | 1895132 | 1880064 | 30644 | 10451 | 20497 | 22608 | 44490 | 20173 | 20000 | 10100 |
Result (median cycles for code): 11.4047
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30026 | 114637 | 46784 | 22506 | 24278 | 14583 | 25205 | 2083741 | 2008492 | 39966 | 14771 | 28756 | 28996 | 55973 | 22249 | 20000 | 10010 |
30024 | 114156 | 46665 | 22095 | 24570 | 14458 | 25046 | 2080271 | 2007478 | 39446 | 14412 | 28258 | 28404 | 54951 | 22512 | 20000 | 10010 |
30024 | 114028 | 46859 | 22552 | 24307 | 13689 | 24279 | 2073370 | 1998978 | 37987 | 13719 | 27220 | 29049 | 55938 | 22021 | 20000 | 10010 |
30024 | 114289 | 46372 | 22220 | 24152 | 14031 | 24476 | 2082312 | 2006255 | 38450 | 13985 | 27355 | 28210 | 54482 | 22093 | 20000 | 10010 |
30024 | 113975 | 45933 | 22292 | 23641 | 13576 | 24442 | 2087581 | 2010483 | 38458 | 14026 | 27592 | 28330 | 54950 | 22072 | 20000 | 10010 |
30024 | 113805 | 45763 | 22326 | 23437 | 13379 | 24448 | 2076018 | 2001456 | 38400 | 13965 | 27515 | 28363 | 54756 | 22368 | 20000 | 10010 |
30024 | 113875 | 46096 | 22138 | 23958 | 14088 | 24411 | 2079487 | 2003316 | 38338 | 13937 | 27506 | 27734 | 53953 | 22402 | 20000 | 10010 |
30024 | 113601 | 46331 | 22224 | 24107 | 14625 | 25390 | 2080193 | 2005345 | 40066 | 14692 | 28937 | 28478 | 55121 | 22582 | 20000 | 10010 |
30024 | 113956 | 46256 | 22133 | 24123 | 14631 | 24940 | 2087657 | 2011550 | 39328 | 14404 | 28169 | 27688 | 53911 | 22419 | 20000 | 10010 |
30024 | 113842 | 46153 | 22056 | 24097 | 14188 | 25082 | 2080325 | 2005460 | 39581 | 14509 | 28401 | 28094 | 54322 | 22493 | 20000 | 10010 |