Apple Microarchitecture Research by Dougall Johnson
M1/A14 P-core (Firestorm): Overview | Base Instructions | SIMD and FP Instructions
M1/A14 E-core (Icestorm): Overview | Base Instructions | SIMD and FP Instructions
Code:
steor x0, [x6] nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 3.000
Issues: 3.002
Integer unit issues: 1.003
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
73005 | 34294 | 3016 | 1012 | 2004 | 1002 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34092 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34072 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34078 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34080 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34081 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34080 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34069 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34075 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34077 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
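The per-instruction figures in the summary can be related to the raw counter rows by dividing each count by the number of copies of the code in a sample. The sketch below assumes 1000 copies per sample; that figure is inferred from the counts (for example, 2000 load/store dispatches) and is not stated in this section. The names `STEADY_ROW` and `per_copy` are illustrative only.

```python
# ASSUMPTION (not stated in the source): each sample runs 1000 copies of the code.
ASSUMED_COPIES = 1000

# Selected counters copied from a steady-state row of the table above.
STEADY_ROW = {
    "retire uop (01)": 73004,
    "schedule uop (52)": 3003,
    "schedule int uop (53)": 1003,
    "schedule ldst uop (55)": 2000,
}

def per_copy(count, copies=ASSUMED_COPIES):
    """Average number of events per copy of the code, to three decimals."""
    return round(count / copies, 3)

int_issues = per_copy(STEADY_ROW["schedule int uop (53)"])
ldst_issues = per_copy(STEADY_ROW["schedule ldst uop (55)"])

print(f"Integer unit issues: {int_issues:.3f}")
print(f"Load/store unit issues: {ldst_issues:.3f}")
```

Under that assumption the integer and load/store figures agree with the summary lines. The same division gives 3.003 for `schedule uop (52)`, slightly off from the reported 3.002, so the summary may use a different sample or a correction that this section does not describe.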
Code:
steor x0, [x6]
add x6, x6, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 3.0063
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40211 | 31186 | 40653 | 20491 | 20162 | 20323 | 20007 | 116635 | 107647 | 40114 | 20207 | 20007 | 30211 | 40013 | 20010 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 116750 | 107582 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 116754 | 107592 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 116749 | 107584 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20038 | 111900 | 118908 | 40176 | 20238 | 20038 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 116747 | 107576 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 116746 | 107576 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20038 | 111241 | 119741 | 40176 | 20238 | 20038 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30065 | 40111 | 20109 | 20002 | 20104 | 20004 | 116996 | 107697 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40110 | 20108 | 20002 | 20104 | 20004 | 116653 | 107410 | 40108 | 20204 | 20004 | 30206 | 40008 | 20010 | 20000 | 20100 |
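The "Result (median cycles for code)" line appears to follow from the cycle column of the table above. A minimal sketch, assuming each sample executes the code 10000 times and that the lower of the two middle samples is taken as the median; both points are inferred from the numbers and are not stated in this section.

```python
from statistics import median_low

# ASSUMPTION (not stated in the source): the code runs 10000 times per sample.
ASSUMED_EXECUTIONS = 10000

# "cycle (02)" column of the ten samples in the table above.
cycles = [31186, 30063, 30063, 30063, 30063, 30063, 30063, 30063, 30065, 30063]

# median_low picks an actual sample rather than averaging the two middle ones.
median_cycles = median_low(cycles) / ASSUMED_EXECUTIONS

print(f"Result (median cycles for code): {median_cycles:.4f}")
```

Using the median rather than the mean keeps the warm-up sample (31186 cycles) from skewing the result. For this table the two middle samples are equal, so the plain median would give the same value.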
Result (median cycles for code): 3.0066
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40029 | 30812 | 40389 | 20295 | 20094 | 20164 | 20004 | 116616 | 107641 | 40018 | 20024 | 20004 | 30031 | 40013 | 20010 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116606 | 107634 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116603 | 107628 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116597 | 107616 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116615 | 107652 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116598 | 107618 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116615 | 107652 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20038 | 113508 | 118333 | 40086 | 20058 | 20038 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116613 | 107648 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30066 | 40017 | 20017 | 20000 | 20010 | 20000 | 116545 | 107511 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
Code:
steor x0, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 12.9754
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
30205 | 130486 | 41452 | 21414 | 0 | 20038 | 10130 | 0 | 20000 | 2454321 | 2317483 | 30100 | 10200 | 20000 | 20200 | 40000 | 21310 | 20000 | 0 | 10100 |
30204 | 129761 | 41410 | 21410 | 0 | 20000 | 10100 | 0 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21310 | 20000 | 0 | 10100 |
30204 | 129754 | 41409 | 21409 | 0 | 20000 | 10100 | 0 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 0 | 10100 |
30204 | 129754 | 41409 | 21409 | 0 | 20000 | 10100 | 0 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 0 | 10100 |
30204 | 129754 | 41409 | 21409 | 0 | 20000 | 10100 | 0 | 20000 | 2449837 | 2311753 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 0 | 10100 |
30204 | 129792 | 41396 | 21396 | 0 | 20000 | 10100 | 0 | 20000 | 2450250 | 2312113 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 0 | 10100 |
30204 | 129754 | 41409 | 21409 | 0 | 20000 | 10100 | 0 | 20000 | 2450048 | 2311933 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 0 | 10100 |
30204 | 129761 | 41410 | 21410 | 0 | 20000 | 10100 | 0 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 0 | 10100 |
30204 | 129761 | 41410 | 21410 | 0 | 20000 | 10100 | 0 | 20000 | 2449943 | 2311807 | 30100 | 10200 | 20000 | 20200 | 40000 | 21310 | 20000 | 0 | 10100 |
30205 | 127274 | 41160 | 21114 | 0 | 20046 | 10130 | 0 | 20000 | 2450048 | 2311933 | 30100 | 10200 | 20000 | 20200 | 40000 | 21309 | 20000 | 0 | 10100 |
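One way to read the `int uops in schedulers (59)` column is as a running sum of scheduler occupancy, in which case dividing it by the cycle count gives an average occupancy per cycle. This interpretation is an assumption based only on the counter's label. The sketch below applies it to one steady-state sample from each of the two loop tests; the dictionary keys are descriptive labels, not names from the source.

```python
# One steady-state sample per loop test, copied from the tables:
# (cycle (02), int uops in schedulers (59)).
SAMPLES = {
    "steor + add loop (about 3.01 cycles)": (30063, 116750),
    "steor-only loop (about 12.98 cycles)": (129754, 2449943),
}

def average_occupancy(cycles, occupancy_sum):
    # ASSUMPTION: counter 59 adds the number of integer uops sitting in the
    # schedulers on every cycle, so sum / cycles is the mean occupancy.
    return occupancy_sum / cycles

occupancy = {name: average_occupancy(*sample) for name, sample in SAMPLES.items()}

for name, value in occupancy.items():
    print(f"{name}: about {value:.1f} int uops in schedulers per cycle")
```

If the assumption holds, the slower loop keeps several times as many integer uops waiting per cycle even though it retires fewer uops per sample, which would be consistent with uops queuing behind a stall rather than with extra work being executed.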
Result (median cycles for code): 12.9761
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
30025 | 130155 | 41291 | 21255 | 0 | 20036 | 10040 | 0 | 20200 | 2423223 | 2285182 | 30324 | 10134 | 20226 | 20062 | 40085 | 0 | 21285 | 20000 | 0 | 10010 |
30024 | 126403 | 41104 | 21104 | 0 | 20000 | 10010 | 0 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 0 | 21273 | 20000 | 0 | 10010 |
30024 | 129761 | 41283 | 21283 | 0 | 20000 | 10010 | 0 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 0 | 21273 | 20000 | 0 | 10010 |
30024 | 129761 | 41283 | 21283 | 0 | 20000 | 10010 | 0 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20076 | 40110 | 0 | 21042 | 20000 | 0 | 10010 |
30024 | 129726 | 41283 | 21283 | 0 | 20000 | 10010 | 0 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 0 | 21273 | 20000 | 0 | 10010 |
30024 | 129766 | 41281 | 21281 | 0 | 20000 | 10010 | 0 | 20198 | 2391390 | 2256879 | 30321 | 10133 | 20224 | 20020 | 40000 | 0 | 21273 | 20000 | 0 | 10010 |
30025 | 127735 | 41096 | 21050 | 0 | 20046 | 10038 | 0 | 20000 | 2454873 | 2311895 | 30010 | 10020 | 20000 | 20062 | 40085 | 0 | 21282 | 20000 | 0 | 10010 |
30024 | 129761 | 41283 | 21283 | 0 | 20000 | 10010 | 0 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 0 | 21273 | 20000 | 0 | 10010 |
30024 | 129761 | 41283 | 21283 | 0 | 20000 | 10010 | 0 | 20000 | 2456248 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 0 | 21273 | 20000 | 0 | 10010 |
30024 | 129761 | 41283 | 21283 | 0 | 20000 | 10010 | 0 | 20196 | 2443729 | 2302509 | 30319 | 10133 | 20223 | 20020 | 40000 | 0 | 21279 | 20000 | 0 | 10010 |