Apple Microarchitecture Research by Dougall Johnson
Code:
stclrh w0, [x6]
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 3.000
Issues: 3.001
Integer unit issues: 1.002
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
73005 | 34684 | 3018 | 1014 | 2004 | 1002 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34247 | 3002 | 1002 | 2000 | 1000 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73005 | 34295 | 3005 | 1003 | 2002 | 1001 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34213 | 3002 | 1002 | 2000 | 1000 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34220 | 3002 | 1002 | 2000 | 1000 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34240 | 3002 | 1002 | 2000 | 1000 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34216 | 3002 | 1002 | 2000 | 1000 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34215 | 3002 | 1002 | 2000 | 1000 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34221 | 3002 | 1002 | 2000 | 1000 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
73004 | 34279 | 3002 | 1002 | 2000 | 1000 | 2000 | 7760 | 10511 | 3000 | 1000 | 2000 | 2000 | 4000 | 1002 | 2000 | 1000 |
Code:
stclrh w0, [x6]
add x6, x6, 2
(fused SUBS/B.cc loop)
Result (median cycles for code): 3.0056
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
40208 | 30636 | 40378 | 20271 | 20107 | 20230 | 20004 | 115864 | 105860 | 40108 | 20204 | 20004 | 30211 | 40013 | 20010 | 20000 | 0 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 115734 | 105751 | 40108 | 20204 | 20004 | 30206 | 40008 | 20007 | 20000 | 0 | 20100 |
40204 | 30056 | 40109 | 20107 | 20002 | 20104 | 20004 | 115727 | 105737 | 40108 | 20204 | 20004 | 30206 | 40008 | 20007 | 20000 | 0 | 20100 |
40204 | 30056 | 40109 | 20107 | 20002 | 20104 | 20004 | 115700 | 105689 | 40108 | 20204 | 20004 | 30206 | 40008 | 20007 | 20000 | 0 | 20100 |
40204 | 30056 | 40109 | 20107 | 20002 | 20104 | 20004 | 115728 | 105739 | 40108 | 20204 | 20004 | 30206 | 40008 | 20007 | 20000 | 0 | 20100 |
40204 | 30056 | 40109 | 20107 | 20002 | 20104 | 20004 | 115722 | 105727 | 40108 | 20204 | 20004 | 30206 | 40008 | 20007 | 20000 | 0 | 20100 |
40204 | 30056 | 40110 | 20108 | 20002 | 20104 | 20004 | 115719 | 105721 | 40108 | 20204 | 20004 | 30206 | 40008 | 20007 | 20000 | 0 | 20100 |
40204 | 30056 | 40109 | 20107 | 20002 | 20104 | 20004 | 115725 | 105733 | 40108 | 20204 | 20004 | 30206 | 40008 | 20007 | 20000 | 0 | 20100 |
40204 | 30056 | 40109 | 20107 | 20002 | 20104 | 20035 | 113173 | 112327 | 40172 | 20237 | 20035 | 30206 | 40008 | 20008 | 20000 | 0 | 20100 |
40204 | 30056 | 40109 | 20107 | 20002 | 20104 | 20004 | 115712 | 105709 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 0 | 20100 |
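The "Result (median cycles for code)" figure appears to be the median of the "cycle (02)" column divided by the number of loop iterations. A minimal Python sketch of that arithmetic follows; the iteration count of 10000 is an assumption inferred from the counter magnitudes and is not stated on the page, and the function name is illustrative.

```python
from statistics import median

# Values of the "cycle (02)" column, copied from the ten runs above.
cycles = [30636, 30063, 30056, 30056, 30056,
          30056, 30056, 30056, 30056, 30056]

# ASSUMPTION: each run executes the measured code 10000 times.
# The page does not state this; it is inferred from the counter sizes.
ITERATIONS = 10_000


def median_cycles_per_iteration(samples, iterations=ITERATIONS):
    """Median of the raw cycle counts, normalized to one iteration."""
    return median(samples) / iterations


print(f"{median_cycles_per_iteration(cycles):.4f}")  # prints 3.0056
```

Using the median rather than the mean keeps the warm-up outlier in the first run (30636 cycles) from skewing the result.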
Result (median cycles for code): 3.0059
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40026 | 30364 | 40155 | 20115 | 20040 | 20074 | 20004 | 115741 | 105943 | 40018 | 20024 | 20004 | 30020 | 40000 | 20008 | 20000 | 20010 |
40024 | 30066 | 40018 | 20018 | 20000 | 20010 | 20004 | 116075 | 106280 | 40018 | 20024 | 20004 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30059 | 40017 | 20017 | 20000 | 20010 | 20000 | 115576 | 105767 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30059 | 40017 | 20017 | 20000 | 20010 | 20000 | 115563 | 105746 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30059 | 40017 | 20017 | 20000 | 20010 | 20000 | 115563 | 105747 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30059 | 40017 | 20017 | 20000 | 20010 | 20000 | 115571 | 105757 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30059 | 40017 | 20017 | 20000 | 20010 | 20000 | 115569 | 105753 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30059 | 40017 | 20017 | 20000 | 20010 | 20000 | 115577 | 105775 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30059 | 40017 | 20017 | 20000 | 20010 | 20000 | 115561 | 105741 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30059 | 40017 | 20017 | 20000 | 20010 | 20000 | 115576 | 105769 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
Code:
stclrh w0, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 12.6476
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30205 | 134360 | 41753 | 21698 | 20055 | 10130 | 20050 | 2394335 | 2264975 | 30178 | 10228 | 20056 | 20200 | 40000 | 21160 | 20000 | 10100 |
30204 | 127246 | 41286 | 21260 | 20026 | 10100 | 20000 | 2394057 | 2266338 | 30100 | 10200 | 20000 | 20240 | 40077 | 21396 | 20000 | 10100 |
30204 | 127359 | 41253 | 21247 | 20006 | 10100 | 20007 | 2458238 | 2320834 | 30114 | 10207 | 20013 | 20200 | 40000 | 21102 | 20000 | 10100 |
30204 | 126476 | 41202 | 21202 | 20000 | 10100 | 20000 | 2382600 | 2252452 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30205 | 124729 | 40938 | 20892 | 20046 | 10130 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126476 | 41202 | 21202 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126476 | 41202 | 21202 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126476 | 41202 | 21202 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126476 | 41202 | 21202 | 20000 | 10100 | 20051 | 2428633 | 2294471 | 30181 | 10230 | 20058 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126476 | 41202 | 21202 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21100 | 20000 | 10100 |
Result (median cycles for code): 12.9761
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30025 | 132240 | 41578 | 21542 | 20036 | 10040 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |
30024 | 129761 | 41283 | 21283 | 20000 | 10010 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |
30024 | 129761 | 41283 | 21283 | 20000 | 10010 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20060 | 40080 | 21294 | 20000 | 10010 |
30024 | 129800 | 41280 | 21280 | 20000 | 10010 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |
30024 | 129776 | 41287 | 21287 | 20000 | 10010 | 20000 | 2456283 | 2313144 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |
30024 | 129761 | 41283 | 21283 | 20000 | 10010 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21274 | 20000 | 10010 |
30024 | 129761 | 41283 | 21283 | 20000 | 10010 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |
30024 | 129755 | 41261 | 21261 | 20000 | 10010 | 20050 | 2400885 | 2264493 | 30088 | 10048 | 20056 | 20020 | 40000 | 21278 | 20000 | 10010 |
30024 | 129734 | 41256 | 21256 | 20000 | 10010 | 20000 | 2455447 | 2312441 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |
30024 | 129761 | 41283 | 21283 | 20000 | 10010 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |
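To read a counter row as per-iteration figures, the pipe-separated header and row can be zipped into a mapping and divided by the iteration count. The Python sketch below does this for the last row above; `parse_row` is a hypothetical helper and the 10000-iteration count is an assumption inferred from the data, not something the page states.

```python
HEADER = (
    "retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | "
    "schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | "
    "int uops in schedulers (59) | simd uops in schedulers (5a) | "
    "dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | "
    "map int uop inputs (7f) | map ldst uop inputs (80) | "
    "? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |"
)
ROW = (
    "30024 | 129761 | 41283 | 21283 | 20000 | 10010 | 20000 | 2456248 | "
    "2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21273 | 20000 | 10010 |"
)

# ASSUMPTION: 10000 loop iterations per run (inferred, not stated on the page).
ITERATIONS = 10_000


def parse_row(header, row):
    """Map each counter name in the header to the integer in the same column."""
    names = [cell.strip() for cell in header.split("|") if cell.strip()]
    values = [int(cell) for cell in row.split("|") if cell.strip()]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


counters = parse_row(HEADER, ROW)
per_iteration = {name: value / ITERATIONS for name, value in counters.items()}

for name in ("retire uop (01)", "? ldst retires (ed)",
             "? int retires (ef)", "cycle (02)"):
    print(f"{name}: {per_iteration[name]:.4f}")
```

Under that assumption, the row works out to roughly three retired uops per iteration (two load/store, one integer) at about 12.98 cycles per iteration, which agrees with the reported median of 12.9761.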