Apple Microarchitecture Research by Dougall Johnson
Code:
stclrb w0, [x6]
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 3.000
Issues: 3.002
Integer unit issues: 1.003
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
73005 | 34773 | 3018 | 1014 | 2004 | 1002 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34693 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34531 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34754 | 3003 | 1003 | 2000 | 1000 | 2000 | 7769 | 10520 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34095 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34092 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34544 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34319 | 3006 | 1006 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34127 | 3003 | 1003 | 2000 | 1000 | 2002 | 7772 | 10526 | 3003 | 1001 | 2002 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34463 | 3003 | 1003 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
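The page does not say how the summary figures are derived from the raw counters. As a rough sketch only, the per-instruction counts can be approximated by dividing a counter's median by the number of unrolled copies of the code. The copy count of 1000 is an assumption inferred from the counters (for example `map int uop` is 1000 per run), and `per_copy` is a hypothetical helper, not part of the measurement harness.

```python
from statistics import median

# Selected counter columns from the ten runs above (values copied from the table).
retire_uop   = [73005, 73004, 73004, 73004, 73004, 73004, 73004, 73004, 73004, 73004]
map_int_uop  = [1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1001, 1000]
map_ldst_uop = [2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2002, 2000]

COPIES = 1000        # assumption: unrolled copies per run, inferred from the counters
NOPS_PER_COPY = 70   # from the "Retires (minus 70 nops)" line

def per_copy(samples, copies=COPIES):
    """Hypothetical helper: median counter value divided by the copy count."""
    return median(samples) / copies

retires = per_copy(retire_uop) - NOPS_PER_COPY  # retired uops per stclrb, nops removed
int_uops = per_copy(map_int_uop)                # integer uops mapped per stclrb
ldst_uops = per_copy(map_ldst_uop)              # load/store uops mapped per stclrb

print(f"retires={retires:.3f} int={int_uops:.3f} ldst={ldst_uops:.3f}")
```

This gives about 3.004 retires per copy against the published 3.000. The small residue is presumably fixed overhead that the real tooling subtracts, which this sketch does not model. The 1 integer + 2 load/store split agrees with the summary lines.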
Code:
stclrb w0, [x6]
add x6, x6, 2
(fused SUBS/B.cc loop)
Result (median cycles for code): 3.0066
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40206 | 30410 | 40261 | 20216 | 20045 | 20169 | 20007 | 115761 | 105933 | 40114 | 20207 | 20007 | 30211 | 40013 | 20010 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 115909 | 105918 | 40108 | 20204 | 20004 | 30206 | 40008 | 20012 | 20000 | 20100 |
40204 | 30063 | 40114 | 20112 | 20002 | 20104 | 20038 | 114658 | 108749 | 40176 | 20238 | 20038 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 115875 | 105844 | 40108 | 20204 | 20004 | 30206 | 40008 | 20012 | 20000 | 20100 |
40204 | 30063 | 40114 | 20112 | 20002 | 20104 | 20036 | 112670 | 110218 | 40172 | 20236 | 20036 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 115887 | 105878 | 40108 | 20204 | 20004 | 30206 | 40008 | 20012 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 115879 | 105862 | 40108 | 20204 | 20004 | 30206 | 40008 | 20012 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 115895 | 105892 | 40108 | 20204 | 20004 | 30206 | 40008 | 20012 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 115895 | 105894 | 40108 | 20204 | 20004 | 30206 | 40008 | 20012 | 20000 | 20100 |
40204 | 30063 | 40114 | 20112 | 20002 | 20104 | 20004 | 115915 | 105932 | 40108 | 20204 | 20004 | 30206 | 40008 | 20012 | 20000 | 20100 |
Result (median cycles for code): 3.0063
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40026 | 30378 | 40163 | 20121 | 20042 | 20074 | 20004 | 115743 | 105956 | 40018 | 20024 | 20004 | 30020 | 40000 | 20008 | 20000 | 20010 |
40024 | 30066 | 40018 | 20018 | 20000 | 20010 | 20000 | 115700 | 105875 | 40010 | 20020 | 20000 | 30020 | 40000 | 20008 | 20000 | 20010 |
40024 | 30066 | 40019 | 20019 | 20000 | 20010 | 20000 | 115695 | 105865 | 40010 | 20020 | 20000 | 30020 | 40000 | 20009 | 20000 | 20010 |
40024 | 30066 | 40019 | 20019 | 20000 | 20010 | 20000 | 115707 | 105889 | 40010 | 20020 | 20000 | 30020 | 40000 | 20009 | 20000 | 20010 |
40024 | 30066 | 40019 | 20019 | 20000 | 20010 | 20000 | 115697 | 105873 | 40010 | 20020 | 20000 | 30020 | 40000 | 20009 | 20000 | 20010 |
40024 | 30066 | 40019 | 20019 | 20000 | 20010 | 20000 | 115718 | 105911 | 40010 | 20020 | 20000 | 30020 | 40000 | 20009 | 20000 | 20010 |
40024 | 30066 | 40019 | 20019 | 20000 | 20010 | 20000 | 115703 | 105880 | 40010 | 20020 | 20000 | 30020 | 40000 | 20009 | 20000 | 20010 |
40024 | 30066 | 40019 | 20019 | 20000 | 20010 | 20000 | 115712 | 105899 | 40010 | 20020 | 20000 | 30020 | 40000 | 20008 | 20000 | 20010 |
40025 | 30131 | 40097 | 20061 | 20036 | 20048 | 20000 | 115718 | 105918 | 40010 | 20020 | 20000 | 30020 | 40000 | 20008 | 20000 | 20010 |
40024 | 30066 | 40019 | 20019 | 20000 | 20010 | 20000 | 115670 | 105821 | 40010 | 20020 | 20000 | 30020 | 40000 | 20009 | 20000 | 20010 |
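For orientation, the cycle column of the table directly above can be turned into a per-execution cost with a short sketch. The count of 10000 executions of the code per run is an assumption, inferred from `? ldst retires` being 20000 with 2 load/store uops per `stclrb`. The page does not document its exact method, so this only approximates the published result.

```python
from statistics import median

# "cycle (02)" column from the ten runs in the table directly above.
cycles = [30378, 30066, 30066, 30066, 30066, 30066, 30066, 30066, 30131, 30066]

EXECUTIONS = 10000  # assumption: times the stclrb/add pair runs per measurement

# The median suppresses the slower outlier runs (30378 and 30131 here).
cycles_per_execution = median(cycles) / EXECUTIONS

print(f"{cycles_per_execution:.4f} cycles per stclrb + add pair")
```

The estimate lands within 0.001 cycles of both published results (3.0066 and 3.0063), which suggests roughly one `stclrb` every 3 cycles when each one targets a different address.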
Code:
stclrb w0, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 12.9752
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30205 | 131942 | 41623 | 21565 | 20058 | 10132 | 20000 | 2365416 | 2238834 | 30100 | 10200 | 20000 | 20200 | 40000 | 21044 | 20000 | 10100 |
30204 | 125680 | 41149 | 21144 | 20005 | 10100 | 20000 | 2365424 | 2238835 | 30100 | 10200 | 20000 | 20200 | 40000 | 21044 | 20000 | 10100 |
30204 | 125673 | 41148 | 21143 | 20005 | 10100 | 20000 | 2365424 | 2238835 | 30100 | 10200 | 20000 | 20200 | 40000 | 21044 | 20000 | 10100 |
30204 | 125673 | 41148 | 21143 | 20005 | 10100 | 20000 | 2365424 | 2238835 | 30100 | 10200 | 20000 | 20200 | 40000 | 21044 | 20000 | 10100 |
30204 | 125673 | 41148 | 21143 | 20005 | 10100 | 20000 | 2395030 | 2264844 | 30100 | 10200 | 20000 | 20200 | 40000 | 21209 | 20000 | 10100 |
30204 | 125693 | 41146 | 21141 | 20005 | 10100 | 20185 | 2442646 | 2308178 | 30391 | 10306 | 20206 | 20200 | 40000 | 21044 | 20000 | 10100 |
30204 | 125676 | 41148 | 21143 | 20005 | 10100 | 20000 | 2365424 | 2238835 | 30100 | 10200 | 20000 | 20200 | 40000 | 21102 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382541 | 2252434 | 30100 | 10200 | 20000 | 20200 | 40000 | 21102 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382142 | 2252074 | 30100 | 10200 | 20000 | 20200 | 40000 | 21102 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382541 | 2252434 | 30100 | 10200 | 20000 | 20200 | 40000 | 21102 | 20000 | 10100 |
Result (median cycles for code): 12.9754
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30024 | 129325 | 41249 | 21244 | 20005 | 10015 | 20000 | 2513781 | 2363347 | 30010 | 10020 | 20000 | 20118 | 40194 | 21068 | 20000 | 10010 |
30024 | 129754 | 41288 | 21288 | 20000 | 10011 | 20000 | 2456030 | 2312867 | 30010 | 10020 | 20000 | 20078 | 40110 | 20932 | 20000 | 10010 |
30024 | 129767 | 41234 | 21234 | 20000 | 10010 | 20000 | 2456183 | 2312993 | 30010 | 10020 | 20000 | 20020 | 40000 | 21271 | 20000 | 10010 |
30024 | 129754 | 41281 | 21281 | 20000 | 10010 | 20051 | 2382433 | 2248789 | 30091 | 10050 | 20058 | 20020 | 40000 | 21257 | 20000 | 10010 |
30024 | 129754 | 41281 | 21281 | 20000 | 10010 | 20000 | 2456145 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21271 | 20000 | 10010 |
30024 | 129754 | 41281 | 21281 | 20000 | 10010 | 20000 | 2456145 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21271 | 20000 | 10010 |
30024 | 129757 | 41279 | 21279 | 20000 | 10010 | 20000 | 2456093 | 2312921 | 30010 | 10020 | 20000 | 20020 | 40000 | 21271 | 20000 | 10010 |
30024 | 129754 | 41281 | 21281 | 20000 | 10010 | 20000 | 2456145 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21271 | 20000 | 10010 |
30025 | 124042 | 40687 | 20642 | 20045 | 10038 | 20000 | 2456145 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21271 | 20000 | 10010 |
30024 | 129754 | 41281 | 21281 | 20000 | 10010 | 20000 | 2456145 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21271 | 20000 | 10010 |