Apple Microarchitecture Research by Dougall Johnson
Code:
stclr x0, [x6] nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 3.000
Issues: 3.002
Integer unit issues: 1.003
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
73005 | 34447 | 3019 | 1015 | 2004 | 1002 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34124 | 3003 | 1003 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34415 | 3003 | 1003 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34386 | 3003 | 1003 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34154 | 3003 | 1003 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34118 | 3003 | 1003 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34113 | 3003 | 1003 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34113 | 3003 | 1003 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34153 | 3003 | 1003 | 2000 | 1000 | 2002 | 7775 | 10530 | 3003 | 1001 | 2002 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34123 | 3003 | 1003 | 2000 | 1000 | 2000 | 7767 | 10518 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
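The per-unit summary figures above appear to follow from dividing the stable counter values by the number of measured copies of the code. The following is a minimal sketch of that arithmetic, assuming 1000 copies per sample (a figure inferred from the counters, not stated on the page); the names `COPIES` and `per_copy` are hypothetical. It covers only the integer and load/store lines; the "Retires" and "Issues" totals differ slightly from a plain division of the raw counters (73004 and 3003), so they are not reproduced here.

```python
# Steady-state counter values copied from the table above.
# Assumption: each sample covers 1000 copies of the code (inferred, not stated).
COPIES = 1000

counters = {
    "schedule int uop (53)": 1003,
    "schedule ldst uop (55)": 2000,
}


def per_copy(raw, copies=COPIES):
    """Average number of events per copy of the measured code."""
    return raw / copies


int_issues = per_copy(counters["schedule int uop (53)"])
ldst_issues = per_copy(counters["schedule ldst uop (55)"])

print(f"Integer unit issues: {int_issues:.3f}")
print(f"Load/store unit issues: {ldst_issues:.3f}")
```

Under that assumption the two printed values agree with the "Integer unit issues" and "Load/store unit issues" lines reported above.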
Code:
stclr x0, [x6] ; add x6, x6, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 3.0066
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40209 | 31055 | 40531 | 20430 | 20101 | 20261 | 20007 | 116768 | 107794 | 40114 | 20207 | 20007 | 30211 | 40013 | 20010 | 20000 | 20100 |
40204 | 30066 | 40115 | 20110 | 20005 | 20107 | 20004 | 116837 | 107646 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30066 | 40110 | 20108 | 20002 | 20104 | 20004 | 116839 | 107648 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30066 | 40110 | 20108 | 20002 | 20104 | 20004 | 116814 | 107628 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30066 | 40110 | 20108 | 20002 | 20104 | 20004 | 116839 | 107650 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30066 | 40110 | 20108 | 20002 | 20104 | 20004 | 116821 | 107638 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30066 | 40110 | 20108 | 20002 | 20104 | 20004 | 116832 | 107648 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30066 | 40110 | 20108 | 20002 | 20104 | 20004 | 116828 | 107646 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30066 | 40110 | 20108 | 20002 | 20104 | 20004 | 116803 | 107606 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30066 | 40110 | 20108 | 20002 | 20104 | 20004 | 116810 | 107620 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
Result (median cycles for code): 3.0063
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40029 | 30829 | 40402 | 20305 | 20097 | 20164 | 20004 | 116587 | 107585 | 40018 | 20024 | 20004 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116578 | 107579 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116679 | 107776 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30089 | 40020 | 20018 | 20002 | 20014 | 20000 | 116473 | 107483 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116487 | 107511 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116483 | 107503 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116483 | 107503 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116401 | 107350 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116472 | 107481 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116472 | 107481 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
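The "median cycles for code" figure for this run can be recovered from the cycle (02) column. This is a small sketch, assuming each sample covers 10000 copies of the code (inferred from the counter magnitudes, not stated on the page); `COPIES` and `median_cycles_per_copy` are hypothetical names.

```python
from statistics import median

# Assumption: 10000 copies of the code per sample (inferred, not stated).
COPIES = 10000

# cycle (02) column of the ten samples in the table above.
cycles = [30829, 30066, 30066, 30089, 30063,
          30063, 30063, 30063, 30063, 30063]


def median_cycles_per_copy(samples, copies=COPIES):
    """Median of the raw cycle counts, scaled to one copy of the code."""
    return median(samples) / copies


result = median_cycles_per_copy(cycles)
print(f"{result:.4f}")  # 3.0063
```

Using the median rather than the mean keeps the first, slower sample (30829 cycles) from skewing the result.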
Code:
stclr x0, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 12.9058
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30205 | 130291 | 41394 | 21342 | 20052 | 10134 | 20000 | 2450042 | 2311933 | 30100 | 10200 | 20000 | 20260 | 40114 | 20679 | 20000 | 10100 |
30204 | 129754 | 41409 | 21409 | 20000 | 10100 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 10100 |
30204 | 129754 | 41409 | 21409 | 20000 | 10100 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 10100 |
30204 | 129754 | 41409 | 21409 | 20000 | 10100 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 10100 |
30204 | 129754 | 41409 | 21409 | 20000 | 10100 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 10100 |
30205 | 129828 | 41453 | 21415 | 20038 | 10121 | 20000 | 2449838 | 2311764 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 10100 |
30204 | 129754 | 41409 | 21409 | 20000 | 10100 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 10100 |
30204 | 129754 | 41409 | 21409 | 20000 | 10100 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 10100 |
30204 | 129754 | 41409 | 21409 | 20000 | 10100 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 10100 |
30204 | 129754 | 41409 | 21409 | 20000 | 10100 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21102 | 20000 | 10100 |
Result (median cycles for code): 12.9754
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30025 | 146993 | 42775 | 22745 | 20030 | 10035 | 20000 | 2456245 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |
30024 | 129746 | 41245 | 21245 | 20000 | 10010 | 20000 | 2456245 | 2313047 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30025 | 125167 | 40879 | 20833 | 20046 | 10040 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30024 | 129754 | 41282 | 21282 | 20000 | 10010 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30024 | 129754 | 41282 | 21282 | 20000 | 10010 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30024 | 129754 | 41282 | 21282 | 20000 | 10010 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30024 | 129754 | 41282 | 21282 | 20000 | 10010 | 20050 | 2456199 | 2313105 | 30088 | 10048 | 20056 | 20020 | 40000 | 21243 | 20000 | 10010 |
30024 | 129754 | 41282 | 21282 | 20000 | 10010 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30024 | 129754 | 41282 | 21282 | 20000 | 10010 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30024 | 129754 | 41282 | 21282 | 20000 | 10010 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
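As a sanity check on tables like the one above, the pipe-delimited rows can be parsed and tested for internal consistency. The sketch below checks one relationship that holds in these integer and load/store-only samples: schedule uop (52) equals schedule int uop (53) plus schedule ldst uop (55). The helper names `parse_row` and `schedule_counts_consistent` are hypothetical, and the check is an observation about this data, not a documented property of the counters.

```python
# First three samples of the table above, copied verbatim.
ROWS = """
30025 | 146993 | 42775 | 22745 | 20030 | 10035 | 20000 | 2456245 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |
30024 | 129746 | 41245 | 21245 | 20000 | 10010 | 20000 | 2456245 | 2313047 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30025 | 125167 | 40879 | 20833 | 20046 | 10040 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
"""

# Column positions in the header: schedule uop (52), int (53), ldst (55).
SCHED_ALL, SCHED_INT, SCHED_LDST = 2, 3, 4


def parse_row(line):
    """Split one pipe-delimited row into integers, ignoring the trailing '|'."""
    return [int(field) for field in line.split("|") if field.strip()]


def schedule_counts_consistent(row):
    """True when total scheduled uops equal the int + ldst scheduled uops."""
    return row[SCHED_ALL] == row[SCHED_INT] + row[SCHED_LDST]


samples = [parse_row(line) for line in ROWS.strip().splitlines()]
all_consistent = all(schedule_counts_consistent(row) for row in samples)
print(all_consistent)
```

The same parser can be pointed at the other tables on this page; tests that schedule SIMD/FP uops would need an extra term in the sum.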